Calibration of instruments and standards is a refined form of measurement. Measurement
of some property of a thing is an operation that yields as an end result a number that
indicates how much of the property the thing has. Measurement is ordinarily a repeatable 
operation, so that it is appropriate to regard measurement as a production process, the 
"product" being the numbers, i.e., the measurements, that it yields; and to apply to meas- 
urement processes in the laboratory the concepts and techniques of statistical process control 
that have proved so useful in the quality control of industrial production.

Viewed thus it becomes evident that a particular measurement operation cannot be 
regarded as constituting a measurement process unless statistical stability of the type
known as a state of statistical control has been attained. In order to determine whether 
a particular measurement operation is, or is not, in a state of statistical control it is necessary
to be definite on what variations of procedure, apparatus, environmental conditions,
observers, operators, etc., are allowable in "repeated applications" of what will be considered
to be the same measurement process applied to the measurement of the same quantity
under the same conditions. To be realistic, the "allowable variations" must be of sufficient
scope to bracket the circumstances likely to be met in practice. Furthermore, any experimental
program that aims to determine the standard deviation of a measurement process
as an indication of its precision, must be based on appropriate random sampling of this
likely range of circumstances. 

Ordinarily the accuracy of a measurement process may be characterized by giving (a) 
the standard deviation of the process and (b) credible bounds to its likely overall systematic
error. Determination of credible bounds to the combined effect of recognized potential
sources of systematic error always involves some arbitrariness, not only in the placing
of reasonable bounds on the systematic error likely to be contributed by each particular 
assignable cause, but also in the manner in which these individual contributions are com- 
bined. Consequently, the "inaccuracy" of end results of measurement cannot be ex- 
pressed by "confidence limits" corresponding to a definite numerical "confidence level," 
except in those rare instances in which the possible overall systematic error of a final result
is negligible in comparison with its imprecision. 



1. Introduction 

Calibration of instruments and standards is
basically a refined form of measurement. Measurement
is the assignment of numbers to material
things to represent the relations existing among
them with respect to particular properties. One 
always measures properties of things, not the things 
themselves. In practice, measurement of some 
property of a thing ordinarily takes the form of a 
sequence of steps or operations that yields as an end 
result a number that indicates how much of this 
property the thing has, for someone to use for a
specific purpose. The end result may be the outcome
of a single reading of an instrument. More
often it is some kind of average, e.g., the arithmetic
mean of a number of independent determinations of
the same magnitude, or the final result of a least
squares "reduction" of measurements of a number
of different quantities that bear known relations to
each other in accordance with a definite experimental
plan. In general, the purpose for which the answer
is needed determines the accuracy required and
ordinarily also the method of measurement employed.

Specification of the apparatus and auxiliary 
equipment to be used, the operations to be performed, 
the sequence in which they are to be executed, and 
the conditions under which they are respectively to 
be carried out — these instructions collectively serve 
to define a method of measurement. A measure- 
ment process is the realization of a method of 
measurement in terms of particular apparatus and 
equipment of the prescribed kinds, particular condi- 
tions that at best only approximate the conditions 
prescribed, and particular persons as operators and 
observers. 

It has long been recognized that, in undertaking 
to apply a particular method of measurement, a 
degree of consistency among repeated measurements 
of a single quantity needs to be attained before the 
method of measurement concerned can be regarded
as meaningfully realized, i.e., before a measurement 
process can be said to have been established that is
a realization of the method of measurement concerned.
Indeed, consistency or statistical stability
of a very special kind is required: to qualify as a 
measurement process a measurement operation must 
have attained what is known in industrial quality 
control language as a state of statistical control. 
Until a measurement operation has been "debugged"
to the extent that it has attained a state of statistical
control it cannot be regarded in any logical sense as
measuring anything at all. And when it has attained
a state of statistical control there may still remain 
the question of whether it is faithful to the method 
of measurement of which it is intended to be a 
realization. 
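
As a rough illustration of what a check for a state of statistical control can look like, the sketch below applies the familiar Shewhart individuals-chart rule to a series of repeated measurements of one quantity. The function names, the sample readings, and the use of the average moving range (divided by the standard constant 1.128) to estimate sigma are assumptions made for illustration; they are not taken from this paper. Passing such a check is evidence of control, not proof of it.

```python
from statistics import mean


def individuals_chart_limits(readings):
    """Return (center, lower, upper) 3-sigma limits for an individuals chart.

    Sigma is estimated as the average moving range of successive readings
    divided by d2 = 1.128, the usual constant for ranges of two values.
    """
    if len(readings) < 2:
        raise ValueError("need at least two readings")
    moving_ranges = [abs(b - a) for a, b in zip(readings, readings[1:])]
    sigma_hat = mean(moving_ranges) / 1.128
    center = mean(readings)
    return center, center - 3 * sigma_hat, center + 3 * sigma_hat


def out_of_control_points(readings):
    """Indices of readings that fall outside the 3-sigma limits."""
    _, lower, upper = individuals_chart_limits(readings)
    return [i for i, x in enumerate(readings) if x < lower or x > upper]


# Hypothetical repeated measurements of a single quantity (made-up data).
readings = [10.02, 9.98, 10.01, 9.99, 10.00, 10.03, 9.97, 10.60, 10.01, 9.99]
print(out_of_control_points(readings))
```

A reading flagged in this way suggests that the operation has not yet been "debugged"; an empty list says only that this particular rule found nothing to object to.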

The systematic error, or bias, of a measurement 
process refers to its tendency to measure something 
other than what was intended; and is determined by 
the magnitude of the difference μ - τ between the
process average or limiting mean μ associated with
measurement of a particular quantity by the
measurement process concerned and the true value
τ of the magnitude of this quantity. On first
thought, the "true value" of the magnitude of a
particular quantity appears to be a simple straightforward
concept. On careful analysis, however, it
becomes evident that the "true value" of the magnitude
of a quantity is intimately linked to the purposes
for which knowledge of the magnitude of this
quantity is needed, and cannot, in the final analysis, 
be meaningfully and usefully defined in isolation 
from these needs. 

The precision of a measurement process refers to,
and is determined by, the degree of mutual agreement
characteristic of independent measurements of
a single quantity yielded by repeated applications 
of the process under specified conditions; and its 
accuracy refers to, and is determined by, the degree 
of agreement of such measurements with the true 
value of the magnitude of the quantity concerned. 
In brief, "accuracy" has to do with closeness to the
truth; "precision," only with closeness together.

Systematic error, precision, and accuracy are in- 
herent characteristics of a measurement process and 
not of a particular measurement yielded by the 
process. We may also speak of the systematic error, 
precision, and accuracy of a particular method of 
measurement that has the capability of statistical 
control. But these terms are not defined for a meas- 
urement operation that is not in a state of statistical 
control. 

The precision, or more correctly, the imprecision 
of a measurement process is ordinarily summarized 
by the standard deviation of the process, which ex- 
presses the characteristic disagreement of repeated 
measurements of a single quantity by the process 
concerned, and thus serves to indicate by how much 
a particular measurement is likely to differ from other 
values that the same measurement process might 
have provided in this instance, or might yield on re- 
measurement of the same quantity on another occa- 
sion. Unfortunately, there does not exist any single 
comprehensive measure of the accuracy (or inaccuracy)
of a measurement process analogous to the
standard deviation as a measure of its imprecision. 



To characterize the accuracy of a measurement 
process it is necessary, therefore, to indicate (a) its 
systematic error or bias, (b) its precision (or impre- 
cision) — and, strictly speaking, also, (c) the form of 
the distribution of the individual measurements 
about the process average. Such is the unavoidable 
situation if one is to concern one's self with indi- 
vidual measurements yielded by any particular meas- 
urement process. Fortunately, however, "final 
results" are ordinarily some kind of average or ad- 
justed value derived from a set of independent 
measurements, and when four or more independent 
measurements are involved, such adjusted values 
tend to be normally distributed to a very good ap- 
proximation, so that the accuracy of such final results 
can ordinarily be characterized satisfactorily by in- 
dicating (a) their imprecision as expressed by their 
standard error, and (b) the systematic error of the 
process by which they were obtained. 
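
A minimal sketch of item (a) follows, assuming the final result is the arithmetic mean of n independent measurements from a process in a state of statistical control. The function name and the data are hypothetical. The computation addresses imprecision only; it says nothing about the systematic error under (b).

```python
from math import sqrt
from statistics import mean, stdev


def summarize(measurements):
    """Return (mean, sample standard deviation, standard error of the mean).

    The standard error s / sqrt(n) is valid only if the measurements are
    independent and come from one stable measurement process.
    """
    n = len(measurements)
    if n < 2:
        raise ValueError("need at least two measurements")
    s = stdev(measurements)  # sample standard deviation, divisor n - 1
    return mean(measurements), s, s / sqrt(n)


# Hypothetical independent determinations of the same magnitude.
average, s, standard_error = summarize([4.1, 3.9, 4.0, 4.2, 3.8])
print(average, s, standard_error)
```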

The error of any single measurement or adjusted 
value of a particular quantity is, by definition, the 
difference between the measurement or adjusted 
value concerned and the true value of the magnitude 
of this quantity. The error of any particular meas- 
urement or adjusted value is, therefore, a fixed num- 
ber; and this number will ordinarily be unknown and 
unknowable, because the true value of the magnitude 
of the quantity concerned is ordinarily unknown and 
unknowable. Limits to the error of a single meas- 
urement or adjusted value may, however, be in- 
ferred from (a) the precision, and (b) bounds on the 
systematic error of the measurement process by
which it was produced, but not without risk of being
incorrect, because, quite apart from the inexactness
with which bounds are commonly placed on a systematic
error of a measurement process, such limits
are applicable to the error of the single measurement
or adjusted value, not as a unique individual outcome,
but only as a typical case of the errors characteristic
of such measurements of the same quantity
that might have been, or might be, yielded by the 
same measurement process under the same condi- 
tions. 
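
The relation between the error of a single measurement, the precision of the process, and its systematic error can be written out explicitly. In the sketch below, the symbol x for a single measurement (or adjusted value) is an assumption introduced for illustration; the limiting mean μ and the true value τ are written as \mu and \tau.

```latex
% Error of a single measurement x, split into a random part and the bias:
\[
  x - \tau \;=\; \underbrace{(x - \mu)}_{\text{random deviation}}
           \;+\; \underbrace{(\mu - \tau)}_{\text{systematic error}}
\]
% The first term varies from one measurement to the next, with a spread
% summarized by the standard deviation of the process; the second is a
% fixed but ordinarily unknown constant of the process.
```

Limits inferred for the error x - τ therefore combine a statistical statement about the first term with bounds, often partly subjective, on the second.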

Since the precision of a measurement process is de- 
termined by the characteristic "closeness together" 
of successive independent measurements of a single 
magnitude generated by repeated application of the 
process under specified conditions, and its bias or 
systematic error is determined by the direction and 
amount by which such measurements tend to differ 
from the true value of the magnitude of the quantity 
concerned, it is necessary to be clear on what variations
of procedure, apparatus, environmental conditions,
observers, etc., are allowable in "repeated
applications" of what will be considered to be the
same measurement process applied to the measurement
of the same quantity under the same conditions.
If whatever measures of the precision and bias of a
measurement process we may adopt are to provide
a realistic indication of the accuracy of this process in
practice, then the "allowable variations" must be of
sufficient scope to bracket the range of circumstances
commonly met in practice. Furthermore, any experimental
program that aims to determine the precision,
and thence the accuracy of a measurement
process, must be based on an appropriate random
sampling of this "range of circumstances," if the
usual tools of statistical analysis are to be strictly
applicable.

When adequate random sampling of the appropriate
"range of circumstances" is not feasible, or
even possible, then it is necessary (a) to compute, by
extrapolation from available data, a more or less
subjective estimate of the precision of the measurement
process concerned, to serve as a substitute for
a direct experimental measure of this characteristic,
and (b) to assign more or less subjective bounds to
the systematic error of the measurement process.
To the extent that such at least partially subjective
computations are involved, the resulting evaluation
of the overall accuracy of a measurement process
"is based on subject-matter knowledge and skill,
general information, and intuition, but not on statistical
methodology" [Cochran et al. 1953, p. 693].
Consequently, in such cases the statistically precise
concept of a family of "confidence intervals" associated
with a definite "confidence level" or "confidence
coefficient" is not applicable.

The foregoing points and certain other related 
matters are discussed in greater detail in the succeeding
sections, together with an indication of
procedures for the realistic evaluation of precision
and accuracy of established procedures for the
calibration of instruments and standards that minimize
as much as possible the subjective elements of
such an evaluation. To the extent that complete 
elimination of the subjective element is not always 
possible, the responsibility for an important and 
sometimes the most difficult part of the evaluation 
is shifted from the shoulders of the statistician to 
the shoulders of the subject matter "expert." 

2. Measurement 

2.1. Nature and Object 

Measurement is the assignment of numbers to
material things to represent the relations existing 
among them with respect to particular properties. 
The number assigned to some particular property 
serves to represent the relative amount of this prop- 
erty associated with the object concerned. 

Measurement always pertains to properties of 
things, not to the things themselves. Thus we 
cannot measure a meter bar, but can, and usually
do, measure its length; and we could also measure its
mass, its density, and perhaps also its hardness.

The object of measurement is twofold: first, sym- 
bolic representation of properties of things as a 
basis for conceptual analysis; and second, to effect 
the representation in a form amenable to the power- 
ful tools of mathematical analysis. The decisive 
feature is symbolic representation of properties, for 
which end numerals are not the only usable symbols. 

In practice the assignment of a numerical magni- 
tude to a particular property of a thing is ordinarily 
accomplished by comparison with a set of standards, 
or by comparison either of the quantity itself, or of
some transform of it, with a previously calibrated
scale. Thus, length measurements are usually made
by directly comparing the length concerned with a
calibrated bar or tape; and mass measurements, by
directly comparing the weight of a given mass with
the weight of a set of standard masses, by means of
a balance; but force measurements are usually
carried out in terms of some transform, such as by
reading on a calibrated scale the extension that the
force produces in a spring, or the deflection that it
produces in a proving ring; and temperature measurements
are usually performed in terms of some transform,
such as by reading on a calibrated scale the
expansion of a column of mercury, or the electrical
resistance of a platinum wire.

2.2. Qualitative and Quantitative Aspects 

As Walter A. Shewhart, father of statistical con- 
trol charts, has remarked: 

"It is important to realize . . . that there are two aspects
of an operation of measurement; one is quantitative and the
other qualitative. One consists of numbers or pointer readings
such as the observed lengths in n measurements of the
length of a line, and the other consists of the physical manipulations
of physical things by someone in accord with instructions
that we shall assume to be describable in words constituting
a text." [Shewhart 1939, p. 180.]

More specifically, the qualitative factors involved
in the measurement of a quantity are: the apparatus
and auxiliary equipment (e.g., reagents, batteries or
other source of electrical energy, etc.) employed;
the operators and observers, if any, involved; the 
operations performed, together with the sequence in 
which, and the conditions under which, they are 
respectively carried out. 

2.3. Correction and Adjustment of Observations 

The numbers obtained as "readings" on a calibrated
scale are ordinarily the end product of everyday
measurement in the trades and in the home.
In scientific work there are usually two important 
additional quantitative aspects of measurement: 
(1) correction of the readings, or their transforms, to 
compensate for known deviations from ideal execution
of the prescribed operations, and for non-negligible
effects of variations in uncontrolled variables;
and (2) adjustment of "raw" or corrected
measurements of particular quantities to obtain
values of these quantities that conform to restrictions
upon, or interrelations among, the magnitudes
of these quantities imposed by the nature of the
problem.

Thus, it may not be practicable or economically 
feasible to take readings at exactly the prescribed
temperatures; but quite practicable and feasible to
bring and hold the temperature within narrow neighborhoods
of the prescribed values and to record the
actual temperatures to which the respective readings
correspond. In such cases, if the deviations from the
prescribed temperatures are not negligible, "temperature
corrections" based on appropriate theory are
usually applied to the respective readings to bring
them to the values that presumably would have been
observed if the temperature in each instance had 
been exactly as prescribed. 

In practice, however, the objective just stated is 
rarely, if ever, actually achieved. Any "temperature
corrections" applied could be expected to bring the
respective readings "to the values that presumably
would have been observed if the temperature in each
instance had been exactly as prescribed" if and only
if these "temperature corrections" made appropriate
allowances for all of the effects of the deviations of
the actual temperatures from those prescribed.
"Temperature corrections" ordinarily correct only
for particular effects of the deviations of the actual
temperatures from their prescribed values; not for all
of the effects on the readings traceable to deviations
of the actual temperatures from those prescribed.
Thus Michelson utilized "temperature corrections" in
his 1879 investigation of the speed of light; but his
results exhibit a dependence on temperature after
"temperature correction." The "temperature corrections"
applied corrected only for the effects of
thermal expansion due to variations in temperature
and not also for changes in the index of refraction of
the air due to changes in the humidity of the air,
which in June and July at Annapolis is highly correlated
with temperature. Corrections applied in
practice are usually of more limited scope than the 
names that they are given appear to indicate. 

Adjustment of observations is fundamentally 
different from their "correction." When two or more 
related quantities are measured individually, the 
resulting measured values usually fail to satisfy the 
constraints on their magnitudes implied by the given
interrelations among the quantities concerned. In
such cases these "raw" measured values are mutually
contradictory, and require adjustment in order to be
usable for the purpose intended. Thus, measured
values of the three cyclic differences (A - B), (B - C),
and (C - A) between the lengths of three nominally
equivalent gage blocks are mutually contradictory, 
and strictly speaking are not usable as values of 
these differences, unless they sum to zero. 

The primary goal of adjustment is to derive from 
such inconsistent measurements, if possible, adjusted 
values for the quantities concerned that do satisfy the 
constraints on their magnitudes imposed by the 
nature of the quantities themselves and by the 
existing interrelations among them. A second objec- 
tive is to select from all possible sets of adjusted 
values the set that is the "best" — or, at least, a set 
that is "good enough" for the intended purpose — in 
some well-defined sense. Thus, in the above case of 
the measured differences between the lengths of 
three gage blocks, an adjustment could be effected 
by ignoring the measured value of one of the differences
entirely, say, the difference (C - A), and taking
the negative of the sum of the other two as its 
adjusted value, 

\[ \mathrm{Adj}(C-A) = -[(A-B) + (B-C)]. \]

This will certainly assure that the sum of all three
values, (A - B) + (B - C) + Adj(C - A), is zero, as
required, and is clearly equivalent to ascribing all of
the excess or deficit to the replaced measurement,
(C - A). Alternatively, one might prefer to distribute
the necessary total adjustment -[(A - B)
+ (B - C) + (C - A)] equally over the individual
measured differences, to obtain the following set of
adjusted values:



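\[
\begin{aligned}
\mathrm{Adj}(A-B) &= (A-B) - \tfrac{1}{3}\,[(A-B) + (B-C) + (C-A)] \\
                  &= \tfrac{1}{3}\,[2(A-B) - (B-C) - (C-A)], \\
\mathrm{Adj}(B-C) &= \tfrac{1}{3}\,[2(B-C) - (A-B) - (C-A)], \\
\mathrm{Adj}(C-A) &= \tfrac{1}{3}\,[2(C-A) - (A-B) - (B-C)].
\end{aligned}
\]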



Clearly, the sum of these three adjusted values must 
always be zero, as required, regardless of the values 
of the original individual measured differences. 
Furthermore, most persons, I believe, would consider
this latter adjustment the better; and under
certain conditions with respect to the "law of error" 
governing the original measured differences, it is 
indeed the "best." 
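
The two adjustments just described can be sketched numerically. The function names and the measured values below are hypothetical, chosen only so that the three differences fail to sum to zero. The equal-distribution rule subtracts one third of the closure error from each measured difference.

```python
def adjust_by_replacement(d_ab, d_bc, d_ca):
    """Ignore the measured (C - A) and replace it by -[(A - B) + (B - C)]."""
    return d_ab, d_bc, -(d_ab + d_bc)


def adjust_equally(d_ab, d_bc, d_ca):
    """Distribute the total adjustment equally over the three differences."""
    closure = d_ab + d_bc + d_ca  # would be zero for consistent data
    share = closure / 3.0
    return d_ab - share, d_bc - share, d_ca - share


# Hypothetical measured differences (A - B), (B - C), (C - A), in micrometers.
measured = (0.12, -0.05, -0.04)  # sums to 0.03, not zero

print(adjust_by_replacement(*measured))
print(adjust_equally(*measured))
```

Both sets of adjusted values sum to zero; they differ only in how the closure error of 0.03 is apportioned among the three differences.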

Note that no adjustment problem existed at 
the stage when only two of these differences had
been measured, whichever they were, for then the
third could be obtained by subtraction. As a
general principle, when no more observations are 
taken than are sufficient to provide one value of 
each of the unknown quantities involved, then the 
results so obtained are usable at least — they may 
not be "best." On the other hand, when additional 
observations are taken, leading to "overdetermination"
and consequent contradiction of the fundamental
properties of, or the basic relationships among,
the quantities concerned, then the respective observations
must be regarded as contradicting one
another. When this happens the observations 
themselves, or values derived from them, must be 
replaced by adjusted values such that all contradic- 
tion is removed. "This is a logical necessity, since 
we cannot accept for truth that which is contradic- 
tory or leads to contradictory results." [Chauvenet 
1868, p. 472.] 

2.4. Scheduling the Taking of Measurements 

Having done what one can to remove extraneous 
sources of error, and to make the basic measurements 
as precise and as free from systematic error as pos- 
sible, it is frequently possible not only to increase 
the precision of the end results of major interest but 
also to simultaneously decrease their sensitivity to 
sources of possible systematic error, by careful 
scheduling of the measurements required. An 
instance is provided by the traditional procedure for 
calibrating liquid-in-glass thermometers [Waidner 
and Dickinson 1907, p. 702; NPL 1957, pp. 29-30; 
Swindells 1959, pp. 11-12]: Instead of attempting to 
hold the temperature of the comparison bath constant,
a very difficult objective to achieve, the heat
input to the bath is so adjusted that its temperature
is slowly increasing at a steady rate, and then readings
of, say, four test thermometers and two
standards are taken in accordance with the schedule

S1 T1 T2 T3 T4 S2 S2 T4 T3 T2 T1 S1

the readings being spaced uniformly in time so that
the arithmetic mean of the two readings of any one
thermometer will correspond to the temperature of
the comparison bath at the midpoint of the period.
Such scheduling of measurement taking operations so 
that the effects of the specific types of departures 
from perfect control of conditions and procedure will 
have an opportunity to balance out is one of the 
principal aims of the art and science of statistical 
design of experiments. For additional physical 
science examples, see, for instance, Youden [1951a; 
and 1954-1959]. 
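
The balancing effect of such a schedule can be checked with a small simulation. The sketch below assumes a symmetric reading order of the form S1 T1 T2 T3 T4 S2 S2 T4 T3 T2 T1 S1, a bath whose temperature rises exactly linearly, and instruments that read the bath temperature without error. The starting temperature, the rate of rise, and the function names are all hypothetical.

```python
# Assumed symmetric (palindromic) order: each instrument is read twice.
SCHEDULE = ["S1", "T1", "T2", "T3", "T4", "S2",
            "S2", "T4", "T3", "T2", "T1", "S1"]


def bath_temperature(t, start=20.0, rate=0.01):
    """Bath temperature at time t under a steady linear rise (assumed)."""
    return start + rate * t


def mean_reading_by_instrument(schedule, spacing=1.0):
    """Average the two readings of each instrument, uniformly spaced in time."""
    readings = {}
    for k, name in enumerate(schedule):
        readings.setdefault(name, []).append(bath_temperature(k * spacing))
    return {name: sum(v) / len(v) for name, v in readings.items()}


means = mean_reading_by_instrument(SCHEDULE)
midpoint = bath_temperature((len(SCHEDULE) - 1) / 2.0)
for name, value in sorted(means.items()):
    print(name, round(value, 6), round(midpoint, 6))
```

Under these assumptions every instrument's mean reading corresponds to the bath temperature at the midpoint of the period, so the linear drift cancels from comparisons between test thermometers and standards. A drift that is not linear would cancel only in part.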



2.5. Measurement as a Production Process 

We may summarize our discussion of measurement 
up to this point, as follows: Measurement of some
property of a thing in practice always takes the form
of a sequence of steps or operations that yield as an
end result a number that serves to represent the
amount or quantity of some particular property of a
thing, a number that indicates how much of this
property the thing has, for someone to use for a 
specific purpose. The end result may be the out- 
come of a single reading of an instrument, with or 
without corrections for departures from prescribed 
conditions. More often it is some kind of average 
or adjusted value, e.g., the arithmetic mean of a 
number of independent determinations of the same 
magnitude, or the final result of, say, a least squares 
"reduction" of measurements of a number of different
quantities that have known relations to the quantity 
of interest. 

Measurement of some property of a thing is ordinarily
a repeatable operation. This is certainly the
case for the types of measurement ordinarily met in 
the calibration of standards and instruments. It is 
instructive, therefore, to regard measurement as a 
production process, the "product" being the numbers,
that is, the measurements that it yields; and to compare
and contrast measurement processes in the
laboratory with mass production processes in industry.
For the moment it will suffice to note (a) that
when successive amounts or units of "raw material"
are processed by a particular mass production
process, the output is a series of nominally identical
items of product, of the particular type produced
by the mass production operation, i.e., by the
method of production concerned; and (b) that when
successive objects are measured by a particular
measurement process, the individual items of "product"
produced consist of the numbers assigned to
the respective objects to represent the relative 
amounts that they possess of the property deter- 
mined by the method of measurement involved. 



2.6. Methods of Measurement and Measurement 
Processes 

Specification of the apparatus and auxiliary equip- 
ment to be used, the operations to be performed, the 
sequence in which they are to be carried out, and the 
conditions under which they are respectively to be 
carried out: these instructions collectively serve to
define a method of measurement. To the extent that 
corrections may be required they are an integral part 
of measurement. The types of corrections that will 
ordinarily need to be made, and specific procedures 
for making them, should be included among "the 
operations to be performed." Likewise, the essen- 
tial adjustments required should be noted, and 
specific procedures for making them incorporated in 
the specification of a method of measurement. 

A measurement process is the realization of a 
method of measurement in terms of particular 
apparatus and equipment of the prescribed kinds,
particular conditions that at best only approximate
the conditions prescribed, and particular persons as
operators and observers [ASTM 1961, p. 1758;
Murphy 1961, p. 264]. Of course, there will often
be a question whether a particular measurement
process is loyal to the method of measurement of
which it is intended to be a realization; or whether
two different measurement processes can be considered
to be realizations of the same method of
measurement. 

To begin with, written specifications of methods 
of measurement often contain absolutely precise 
instructions which, however, cannot be carried out 
(repeatedly) with complete exactitude in practice; 
for example, "move the two parallel cross hairs of the 
micrometer of the microscope until the graduation 
line of the standard is centered between them." The
accuracy with which such instructions can be carried
out in practice will always depend upon "the circumstances";
in the case cited, on the skill of the
operator, the quality of the graduation line of the
standard, the quality of the screw of the micrometer,
the parallelism of the cross hairs, etc. To the extent 
that the written specification of a method of measure- 
ment involves absolutely precise instructions that 
cannot be carried out with complete exactitude in 
practice there are certain to be discrepancies between 
a method of measurement and its realization by a 
particular measurement process. 

In addition, the specification of a method of 
measurement often includes a number of imprecise 
instructions, such as "raise the temperature slowly," 
"stir well before taking a reading," "make sure that 
the tubing is clean," etc. Not only are such in- 
structions inherently vague, but also in any given 
instance they must be understood in terms of the 
general level of refinement characteristic of the 
context in which they occur. Thus, "make sure that 
the tubing is clean" is not an absolutely definite instruction;
to some people this would mean simply
that the tubing should be clean enough to drink
liquids through; in some laboratory work it might be
interpreted to mean mechanically washed and
scoured so as to be free from dirt and other ordinary
solid matter (but not cleansed also with chemical
solvents to remove more stubborn contaminants);
to an advanced experimental physicist it may mean
not merely mechanically washed and chemically
cleansed, but also "outgassed" by being heated to
and held at a high temperature, near the softening 
point, for an hour or so. All will agree, I believe, 
that it would be exceedingly difficult to make such 
instructions absolutely definite with a convenient 
number of words. To the extent that the specifica- 
tion of a method of measurement includes instruc- 
tions that are not absolutely definite, there will be 
room for differences between measurement processes 
that are intended to be realizations of the very same
method of measurement. 

Recognition of the difficulty of achieving absolute 
definiteness in the specification of a method of 
measurement does not imply that "any old set" of
instructions will serve to define a method of measure- 
ment. Quite the contrary. To qualify as a specifi- 
cation of a method of measurement, a set of instruc- 
tions must be sufficiently definite to insure statistical 
stability of repeated measurements of a single 
quantity, that is, derived measurement processes 
must be capable of meeting the criteria of statistical 
control [Shewhart 1939, p. 131; Murphy 1961, p. 265;
ASTM 1961, p. 1758]. To elucidation of the meaning
of, and need for, this requirement we now turn.

3. Properties of Measurement Processes 

3.1. Requirement of Statistical Control 

The need for attaining a degree of consistency 
among repeated measurements of a single quantity 
before the method of measurement concerned can be 
regarded as meaningful has certainly been recognized 
for a long, long time. Thus Galileo, describing his
famous experiment on the acceleration of gravity 
in which he allowed a ball to roll different distances 
down an inclined plane wrote:

". . . si lasciava (come dico) scendere per il detto canale
la palla, notando, nel modo che appresso dirò, il tempo che
consumava nello scorrerlo tutto, replicando il medesimo atto
molte volte per assicurarsi bene della quantità del tempo, nel
quale non si trovava mai differenza nè anco della decima parte
d'una battuta di polso. Fatta e stabilita precisamente tale
operazione, facemmo scender la medesima palla solamente per
la quarta parte della lunghezza di esso canale . . ." ¹
[Galileo 1638, Third Day; Nat'l. ed., p. 213.]

Something more than mere "consistency" is required, 
however, as Shewhart points out eloquently 
in his very important chapter on "The Specification 
of Accuracy and Precision" [Shewhart 1939, ch. IV]. 
He begins by noting that the description given by 
R. A. Millikan [1903, pp. 195-196] of a method for 
determining the surface tension T of a liquid from 
measurements of the force of tension F of a film of 



1 I am grateful to my colleague Ugo Fano for the following literal translation: 
". . . we let, as I was saying, the ball descend through said channel, recording, 
in a manner presently to be described, the time it took in traversing it all, 
repeating the same action many times to make really sure of the magnitude of 
time, in which one never found a difference of even a tenth of a pulsebeat. Having 
done and established precisely such operation, we let the same ball descend 
only for the fourth part of the length of the same channel; . . ." 



the liquid contains the following instruction with 
regard to the basic readings from which measure- 
ments of F are derived: "Continue this operation 
until a number of consistent readings can be ob- 
tained." Shewhart then comments on this as 
follows: 

". . . the text describing the operation does not say to 
carry out such and such physical operations and call the 
result a measurement of T. Instead, it says in effect not to 
call the result a measurement of T until one has attained a 
certain degree of consistency among the observed values of 
F and hence among those of T. Although this requirement is 
not always explicitly stated in specifications of the operation 
of measurements as it was here, I think it is always implied. 
Likewise, I think it is always assumed that there can be too 
much consistency or uniformity among the observed values 
as, for example, if a large number of measurements of the 
surface tension of a liquid were found to be identical. What 
is wanted but not explicitly described is a specific kind and 
degree of consistency. 

". . . it should be noted that the advice to repeat the 
operation of measuring surface tension until a number of 
consistent readings have been obtained is indefinite in that it 
does not indicate how many readings shall be taken before 
applying a test for consistency, nor what kind of test of 
consistency is to be applied to the numbers or pointer read- 
ings .... One of the objects of this chapter is to see how 
far one can go toward improving this situation by providing 
an operationally definite criterion that preliminary observa- 
tions must meet before they are to be considered consistent 
in the sense implied in the instruction cited above. 

"Before doing this, however, we must give attention not 
so much to the consistency of the n observed values already 
obtained by n repetitions of the operation of measurement as 
we do to the reproducibility of the operation as determined by 
the numbers in the potentially infinite sequence corresponding 
to an infinite number of repetitions of this operation. No 
one would care very much how consistent the first n preliminary 
observations were if nothing could be validly inferred 
from this as to what future observations would show. Hence, 
it seems to me that the characteristic of the numerical aspects 
of an operation that is of greatest practical interest is 
its reproducibility within tolerance limits throughout the infinite 
sequence. The limit to which we may go in this direction is 
to attain a state of statistical control. The attempt to 
attain a certain kind of consistency within the first n ob- 
served values is merely a means of attaining reproducibility 
within limits throughout the whole of the sequence." 
[Shewhart 1939, pp. 131-132.] 

The point that Shewhart makes forcefully, and 
stresses repeatedly later in the same chapter, is that 
the first n measurements of a given quantity gen- 
erated by a particular measurement process provide 
a logical basis for predicting the behavior of further 
measurements of the same quantity by the same 
measurement process if and only if these n measurements 
may be regarded as a random sample from a 
"population" or "universe" of all conceivable 
measurements of the given quantity by the measurement 
process concerned; that is, in the language of 
mathematical statistics, if and only if the n measurements 
in hand may be regarded as "observed 
values" of a sequence of random variables characterized 
by a probability distribution identified with 
the measurement process concerned, and related 
through the values of one or more of its parameters 
to the magnitude of the quantity measured. 

It should be noted especially that nothing is said 
about the mathematical form of the probability 
distribution of these random variables. The im- 
portant thing is that there be one. W. Edwards 
Deming has put this clearly and forcefully in these 
words: 

"In applying statistical theory, the main consideration 
is not what the shape of the universe is, but whether there is 
any universe at all. No universe can be assumed, nor . . . 
statistical theory . . . applied unless the observations show 
statistical control. In this state the samples when cumulated 
over a suitable interval of time give a distribution of a particular 
shape, and this shape is reproduced hour after hour, 
day after day, so long as the process remains in statistical 
control — i.e., exhibits the properties of randomness. In a 
state of control, n observations may be regarded as a sample 
from the universe of whatever shape it is. A big enough 
sample, or enough small samples, enables the statistician to 
make meaningful and useful predictions about future samples. 
This is as much as statistical theory can do. 

". . . Very often the experimenter, instead of rushing in 
to apply [statistical methods] should be more concerned 
about attaining statistical control and asking himself whether 
any predictions at all (the only purpose of his experiment), 
by statistical theory or otherwise, can be made." [Deming 
1950, pp. 502-503.] 

Shewhart was well aware of the fact that from a 
set of n measurements in hand it is not possible to 
decide with absolute certainty whether they do or 
do not constitute a random sample from some 
definite statistical "population" characterized by a 
probability distribution. He, therefore, proposed 
[Shewhart 1939, pp. 146-147] that in any particular 
instance one should "decide to act for the present as 
if"² the measurements in hand (and their immediate 
successors) were a simple random sample from a 
definite statistical population — i.e., in the language 
of mathematical statistics, were "observed values" 
of independent identically distributed random vari- 
ables — only if the measurements in hand met the 
requirements of the small-samples version of Criterion I 
of his previous book [Shewhart 1931, pp. 309-318] 
and of certain additional tests of randomness 
that he described explicitly for the first time in his 
contribution to the University of Pennsylvania Bi- 
centennial Conference in September 1940 [Shewhart, 
1941]. In other words, Shewhart proposed that one 
should consider a measurement process to be — i.e., 
should "decide to act for the present as if" the 
process were — in a state of (simple) statistical 
control, only if the measurements in hand show no 
evidence of lack of statistical control when analyzed 
for randomness in the order in which they were taken 
by the control chart techniques for averages and 
standard deviations that he had found so valuable 
in industrial process control and by certain addi- 
tional tests for randomness based on "runs above 
and below average" and "runs up and down." 
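
The idea of a test based on "runs above and below average" can be made concrete with a short sketch. It is offered only as an illustration under stated assumptions, and is neither Shewhart's Criterion I nor his exact run tests: it counts the runs of successive measurements lying above or below the sample average and compares the count with the number expected for a random ordering, using the usual large-sample normal approximation for the number of runs. The function name `runs_about_average` and the decision to drop values exactly equal to the average are assumptions made here.

```python
import math
import statistics


def runs_about_average(values):
    """Count runs above/below the sample average, in the order taken.

    Returns (observed runs, expected runs under randomness, z-score).
    Illustrative sketch only; values equal to the average are dropped.
    """
    avg = statistics.fmean(values)
    signs = [v > avg for v in values if v != avg]
    n1 = sum(signs)              # number of values above the average
    n2 = len(signs) - n1         # number of values below the average
    if n1 == 0 or n2 == 0:
        raise ValueError("need values on both sides of the average")
    # A new run starts whenever successive values change sides.
    runs = 1 + sum(1 for a, b in zip(signs, signs[1:]) if a != b)
    n = n1 + n2
    expected = 2.0 * n1 * n2 / n + 1.0
    variance = 2.0 * n1 * n2 * (2.0 * n1 * n2 - n) / (n * n * (n - 1.0))
    if variance <= 0.0:
        raise ValueError("too few values for the normal approximation")
    z = (runs - expected) / math.sqrt(variance)
    return runs, expected, z


# Hypothetical readings, listed in the order in which they were taken.
drifting = [1, 2, 3, 4, 5, 6, 7, 8]      # steady drift: very few runs
alternating = [1, 8, 2, 7, 3, 6, 4, 5]   # strict alternation: very many runs
print(runs_about_average(drifting))
print(runs_about_average(alternating))
```

A z-score far below zero (too few runs) points to drift or to shifts of level between occasions; a z-score far above zero (too many runs) points to systematic alternation. For short series the normal approximation is rough, and exact tables of the run distribution would be preferable.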



2 This very explicit phraseology is due to John W. Tukey [1960, p. 424]. 

3 Thomas Simpson, in his now famous letter [Simpson 1755] to the President of 
the Royal Society of London "on the Advantage of taking the Mean of a Number 
of Observations, in practical Astronomy," was the first to consider repeated 
measurements of a single quantity by a given measurement process as observed 
values of independent random variables having the same probability distribution. 
His conclusion is of interest in itself: 

"Upon the whole of which it appears, that the taking of the Mean of a number 
of observations, greatly diminishes the chances for all the smaller errors, and cuts 
off almost all possibility of any great ones: which last consideration, alone, seems 
sufficient to recommend the use of the method, not only to astronomers, but 
to all others concerned in making of experiments of any kind (to which the above 
reasoning is equally applicable). And the more observations or experiments 
there are made, the less will the conclusion be liable to err, provided they admit 
of being repeated under the same circumstances." 



Simpson³ did not prove that taking of the Arithmetic 
Mean was the best thing to do but merely 
that it is good. However, in accomplishing this goal 
he did something much more important: he took the 
bold step of regarding errors of measurement, not as 
unique unrelated magnitudes unamenable to mathematical 
analysis, but as distributed in accordance 
with a probability distribution that was an intrinsic 
property of the measurement process itself. He 
thus opened the way to a mathematical theory of 
measurement based on the mathematical theory of 
probability; and, in particular, to the formulation 
and development of the Method of Least Squares in 
essentially its present-day form by Gauss (1809, 
1821) and Laplace (1812). 

"Student" (William Sealy Gosset, 1876-1937), 
pioneer statistical consultant and "father" of the 
"theory of small samples," was certainly among the 
first to stress the importance of randomness in 
measurement and experimentation. Thus, he began 
his revolutionary 1908 paper on "The probable error 
of a mean" with these remarks: 

"Any experiment may be regarded as forming an individual 
of a 'population' of experiments which might be 
performed under the same conditions. A series of experiments 
is a sample drawn from this population. 

"Now any series of experiments is only of value in so far 
as it enables us to form a judgment as to the statistical 
constants of the population to which the experiments belong." 
[Student 1908, p. 1.] 

None of these writers, nor any of their contemporaries, 
however, provided "an operationally definite 
criterion that preliminary observations must 
meet" before we take it upon ourselves "to act for 
the present as if" they and their immediate successors 
were random samples from a "population" or "universe" 
of all conceivable measurements of the given 
quantity by the measurement process concerned. 
Provision of such a criterion is Shewhart's major 
contribution. 

Experience shows that in the case of measurement 
processes the ideal of strict statistical control that 
Shewhart prescribes is usually very difficult to 
attain, just as in the case of industrial production 
processes. Indeed, many measurement processes 
simply do not and, it would seem, cannot be made 
to conform to this ideal of producing successive 
measurements of a single quantity that can be 
considered to be "observed values" of independent 
identically distributed random variables.⁴ The nature 
of the "trouble" was stated succinctly by 
Student in 1917 when, speaking of physical and 
chemical determinations, he wrote: 

"After considerable experience I have not encountered 
any determination which is not influenced by the date on 
which it is made; from this it follows that a number of deter- 
minations of the same thing made on the same day are likely 



4 Looking at the matter from a fundamental viewpoint, perhaps we should 
say, not that Shewhart's ideal of strict statistical control is unattainable in the 
case of such measurement processes, but rather that the degree of approximation 
to this ideal can be made as close as one chooses, if one is willing to pay the price. 
In other words, how close one chooses to bring a measurement process to the ideal 
of strict statistical control is, in any given instance, basically an economic matter, 
taking into account, of course, not only the immediate purpose(s) for which the 
measurements are intended but also the other uses to which they may be put. 
(Compare Simon [1946, p. 566] and Eisenhart [1952, p. 554].) 
to lie more closely together than if the repetitions had been 
made on different days." [Student 1917, p. 415.] 

In other words, production of measurements seems 
to be like the production of paint; and just as in the 
case of paint, if one must cover a large surface all of 
which is visible simultaneously, one will do well to 
use paint all from the same batch, so in the case of 
measurements, if a scientist or metrologist "wishes 
to impress his clients" he will "arrange to do repetition 
analyses as nearly as possible at the same time." 
[Student 1927, p. 155.] 

Fortunately, just as one may blend paint from 
several batches to obtain a more uniform color, and 
one which is, presumably, closer to the "process 
average," so also may a scientist or metrologist 
"if he wishes to diminish his real error, . . . separate 
[his measurements] by as wide an interval of time as 
possible" [Student, loc.cit.] and then take an appropri- 
ate average of them as his determination. Consequ- 
ently, if we are to permit such averaging as an allow- 
able step in a fully specified measurement process (see 
sec. 2.6 above), then we are obliged to recognize both 
within-day and between-day components of variation, 
and accept such a complex measurement process as 
being in a state of statistical control overall, or as 
we shall say, in a state of COMPLEX statistical 
control, when the components of within-day and 
between-day variation are both in a state of statis- 
tical control in Shewhart's strict sense, which we 
shall term SIMPLE statistical control. In more 
complex situations, one may be obliged to recognize 
more than two "layers" of variation, and, sometimes, 
more than a single component of variation 
within a given "layer." 
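
To make the two "layers" concrete, the sketch below estimates within-day and between-day components of variance from a balanced set of measurements (the same number of repetitions on each day) by the standard one-way random-effects analysis of variance. It is an illustration under assumptions made here, not a procedure prescribed in the text: the function name `variance_components`, the balanced layout, and the truncation of a negative between-day estimate at zero are all choices made for the example, and the data are invented.

```python
import statistics


def variance_components(days):
    """Estimate (within-day variance, between-day variance component).

    `days` is a list of lists, one inner list of measurements per day,
    all of the same length (balanced layout; an assumption of this sketch).
    """
    k = len(days)
    n = len(days[0]) if k else 0
    if k < 2 or n < 2 or any(len(d) != n for d in days):
        raise ValueError("need >= 2 days, each with the same number (>= 2) of values")
    day_means = [statistics.fmean(d) for d in days]
    grand_mean = statistics.fmean(day_means)
    # Mean square within days: pooled scatter about each day's own mean.
    ms_within = sum((x - m) ** 2 for d, m in zip(days, day_means) for x in d) / (k * (n - 1))
    # Mean square between days: scatter of the day means about the grand mean.
    ms_between = n * sum((m - grand_mean) ** 2 for m in day_means) / (k - 1)
    # Expected ms_between = within variance + n * between-day variance.
    var_between = max(0.0, (ms_between - ms_within) / n)
    return ms_within, var_between


# Invented data: three days, two measurements per day.
days = [[10.1, 9.9], [10.6, 10.4], [9.6, 9.4]]
print(variance_components(days))
```

With these invented values the day means differ far more than the repetitions within a day, which is the situation Student describes. Whether each component is itself in a state of statistical control must still be checked separately, for example by control charts on the day averages and on the within-day ranges.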

Adopting this more general concept of statistical 
control, R. B. Murphy of the Bell Telephone Laboratories 
in his essay "On the Meaning of Precision and 
Accuracy" [Murphy 1961], published in advance of 
the issuance by the American Society for Testing 
and Materials of its Tentative Recommended 
Practice with respect to the "Use of the Terms 
Precision and Accuracy as Applied to Measurement 
of a Property of a Material" [ASTM 1961], remarks: 

"Following through with this line of thought borrowed 
from quality control, we shall add a requirement that an 
effort to follow a test method ought not to be known as a 
measurement process unless it is capable of statistical control. 
Capability of control means that either the measurements 
are the product of an identifiable statistical universe or an 
orderly array of such universes or, if not, the physical causes 
preventing such identification may themselves be identified 
and, if desired, isolated and suppressed. Incapability of 
control implies that the results of measurement are not to be 
trusted as indications of the physical property at hand — in 
short, we are not in any verifiable sense measuring any- 
thing .... Without this limitation on the notion of 
measurement process, one is unable to go on to give meaning 
to those statistical measures which are basic to any discussion 
of precision and accuracy." [Murphy 1961, pp. 264-265.] 

3.2. Postulate of Measurement and the Concept of 
a Limiting Mean 

A conspicuous characteristic of measurement is 
disagreement of repeated measurements of the same 
quantity. Experience shows that, when high accu- 



racy is sought, repeated measurements of the same 
quantity by a particular measurement process do 
not yield uniformly the same number.⁵ We explain 
these discordances by saying that the individual 
measurements are affected by errors, which we 
interpret to be the manifestations of variations in 
the execution of the process of measurement resulting 
from "the imperfections of instruments, and of 
organs of sense," and from the difficulty of achieving 
(or even specifying with a convenient number of 
words) the ideal of perfect control of conditions and 
procedure. 

This "cussedness of measurements" brings us face 
to face with a fundamental question: In what sense 
can we say that the measurements yielded by a 
particular measurement process serve to determine 
a unique magnitude, when experience shows that 
repeated measurement of a single quantity by this 
process yields a sequence of nonidentical numbers? 
What is the value thus determined? 

The answer takes the form of a postulate about 
measurement processes that has been expressed by 
N. Ernest Dorsey, as follows: 

"The mean of a family of measurements — of a number 
of measurements for a given quantity carried out by the 
same apparatus, procedure and observer — approaches a definite 
value as the number of measurements is indefinitely 
increased. Otherwise, they could not properly be called 
measurements of a given quantity. In the theory of errors, 
this limiting mean is frequently called the 'true' value, al- 
though it bears no necessary relation to the true quaesitum, 
to the actual value of the quantity that the observer desires 
to measure. This has often confused the unwary. Let us 
call it the limiting mean." [Dorsey 1944, p. 4; Dorsey and 
Eisenhart 1953, p. 103.] 

In my lectures at the National Bureau of Stand- 
ards, and elsewhere, I have termed this — or rather 
a slightly rephrased version of it — the Postulate of 
Measurement. A mathematical basis for it is pro- 
vided by the Strong Law of Large Numbers, a 
theorem in the mathematical theory of probability 
discovered during the present century. See, for 
example, Feller [1957, pp. 243-245, 374], Gnedenko 
[1962, pp. 241-249], or Parzen [1960, p. 420]. 

Needless to say, by a "family of measurements" 
Dorsey means, not a succession of "raw" readings, 
but rather a succession of adjusted or corrected 
values which, by virtue of adjustment or correction, 
can rightfully be considered to be determinations of 
a single magnitude. 

a. Mathematical Formulation 

The foregoing can be expressed mathematically 
as follows: on some particular occasion, say the ith, 
we may take a number of successive measurements 
of a single quantity by a given measurement process 
under certain specified circumstances. Let 

$$x_{i1},\ x_{i2},\ \ldots,\ x_{ij},\ \ldots \tag{1}$$



5 The qualification "when high accuracy is sought" is essential; for if using an 
ordinary two-pan chemical balance we measure and record the mass of a small 
metallic object only to the nearest gram, then we would expect all of our measurements 
to be the same — except in the equivocal case of a mass equal, or very nearly 
equal, to an odd multiple of ½ g, and such equivocal cases can be resolved easily 
by adding a ½ g mass to one pan. Full accordance of measurements clearly 
cannot be taken as incontestable evidence of high accuracy; but rather should be 
regarded as evidence of limited accuracy. 
denote the sequence of measurements so generated. 
Conceptually at least, this sequence could be continued 
indefinitely. Likewise, on different occasions 
we might start a new sequence, using the same 
measurement procedure and applying it to measurement 
of the same quantity under the same fixed 
set of circumstances. Each such fresh "start" 
would correspond to a different value of i. If, for 
example, the measurement process concerned is sta- 
tistically stable in the sense of being in a state of 
statistical control as defined by Shewhart [1939], then 
the Strong Law of Large Numbers will be applica- 
ble and we may expect the sequence of cumulative 
arithmetic means on the ith occasion, namely, 

$$\bar{x}_{in} = (x_{i1} + x_{i2} + \cdots + x_{in})/n, \qquad (n = 1, 2, \ldots), \tag{2}$$

to converge to μ, a number that constitutes the 
limiting mean associated with the quantity measured 
by this measurement process under the circumstances 
concerned, but independent of the "occasion," 
that is, independent of the value of "i." 
The Strong Law of Large Numbers does not guarantee 
that the sequence (2) for a particular value 
of "i" will converge to μ as the number of observations 
n on this occasion tends to infinity, but simply 
states that among the family of such sequences 
corresponding to a large number of different starts, 
(i = 1, 2, . . .), the instances of nonconvergence to μ 
will be rare exceptions. In other words, if the meas- 
urement process with which one is concerned satis- 
fies the conditions for validity of the Strong Law 
of Large Numbers, then in practice one is almost 
certain to be working with a "good" sequence — one 
for which (2) would converge to μ if the number of 
observations were continued indefinitely — but "bad" 
occasions can occur, though rarely. Thus, the Postulate 
of Measurement expresses something better 
than an "on-the-average" property — it expresses an 
"in-almost-all-cases" property. Furthermore, this 
limiting mean μ, the value of which each individual 
measurement x is trying to express, can be regarded 
not only as the mean or "center of gravity" of the 
infinite conceptual population of all measurements 
x that might conceivably be generated by the measurement 
process concerned under the specified circumstances, 
but also as the value of the quantity 
concerned as determined by this measurement 
process. 
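
The behavior of the cumulative arithmetic means (2) on a single occasion can be illustrated by simulation. The sketch below is a toy model only: it assumes a process in simple statistical control whose measurements are independent draws from a normal distribution with limiting mean `mu` and standard deviation `sigma`. The normal form, the numerical values, and the fixed seed are all assumptions made for the example; the Postulate itself says nothing about the form of the distribution.

```python
import random


def cumulative_means(measurements):
    """Return the cumulative arithmetic means of a sequence, as in (2)."""
    means = []
    total = 0.0
    for n, x in enumerate(measurements, start=1):
        total += x
        means.append(total / n)
    return means


# One simulated "occasion": a hypothetical process in statistical control.
rng = random.Random(12345)   # fixed seed so that the run is repeatable
mu, sigma = 10.0, 0.5        # assumed limiting mean and standard deviation
occasion = [rng.gauss(mu, sigma) for _ in range(20000)]
means = cumulative_means(occasion)

# The cumulative mean settles into an ever narrower neighborhood of mu.
for n in (10, 100, 1000, 20000):
    print(n, round(means[n - 1], 4))
```

Changing the seed corresponds to a fresh "start," that is, to a different value of i; the individual cumulative means differ from one start to the next, but each sequence is expected to settle toward the same limiting mean.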

b. Aim of the Postulate 

The sole aim of the Postulate of Measurement is 
axiomatic acceptance of the existence of a limit ap- 
proached by the arithmetic mean of a finite number 
n of measurements generated by any measurement 
process as n → ∞. It says nothing about how the 
"best" estimate of this limiting mean is to be obtained 
from a finite number of such observations. 
The Postulate is an answer to the need of the prac- 
tical man for a justification of his desire to consider 
the sequence of nonidentical numbers that he obtains 
when he attempts to measure a quantity "by the 
same method under like circumstances" as pertaining 
to a single magnitude, in spite of the evident dis- 



cordance of its elements. The Postulate aims to 
satisfy this need by telling him that if he were to 
continue taking more and still more measurements on 
this quantity "by the same method under like cir- 
cumstances" ad infinitum, and were to calculate 
their cumulative arithmetic means at successive 
stages of this undertaking, then he would find that 
the successive terms of this sequence of cumulative 
arithmetic means would settle down to a narrower 
and ever narrower neighborhood of some definite 
number which he could then accept as the value of 
the magnitude that his first few measurements were 
striving to express. 

c. Importance of Limiting Mean 

The concept of a limiting mean associated with the 
measurement of a given quantity by a particular 
measurement process that is in a state of statistical 
control is important because by means of statistical 
methods based on the mathematical theory of probability 
we can make quantitative inferential statements, 
with known chances of error, about the magnitude 
of this limiting mean from a set of measurements 
of the given quantity by the measurement 
process concerned. The magnitude of the limiting 
mean associated with the measurement of a given 
quantity by a particular measurement process must 
be carefully distinguished from the true magnitude 
of the quantity measured, about which we may be 
tempted to make similar inferential statements. 
Insofar as we make statistical inferences from a set 
of measurements, we make them with respect to a 
property of the measurement process involved under 
the circumstances concerned. The step from quanti- 
tative inferential statements about the limiting mean 
associated with the measurement of a given quantity 
by a particular measurement process, to quantitative 
statements about the true magnitude of the quantity 
concerned, may be based on subject matter knowledge 
and skill, general information and intuition — 
but not on statistical methodology. (Compare 
Cochran, Mosteller, and Tukey [1953, pp. 692-693].) 

3.3. Definition of the Error of a Measurement, and 
of the Systematic Error, Precision, and Accuracy 
of a Measurement Process 

a. Error of a Single Measurement or Adjusted Value 

The error of any measurement of a particular 
quantity is, by definition, the difference between the 
measurement concerned and the true value of the 
magnitude of this quantity, taken positive or nega- 
tive accordingly as the measurement is greater or 
less than the true value. In other words, if x denotes 
a single measurement of a quantity, or an adjusted 
value derived from a specific set of individual measurements, 
and τ is the true value of the magnitude of 
the quantity concerned, then, by definition, 

the error of x as a measurement of τ = x − τ. 

The error of any particular measurement or adjusted 
value, x, is, therefore, a fixed number. The 
numerical magnitude and sign of this number will 
ordinarily be unknown and unknowable, because the 
true value of the magnitude of the quantity con- 
cerned is ordinarily unknown and unknowable. 
Limits to the error of a single measurement or 
adjusted value may, however, be inferred from (a) 
the precision, and (b) bounds on the systematic 
error, of the measurement process by which it was 
produced — but not without risk of being incorrect, 
because, quite apart from the inexactness with which 
bounds are commonly placed on the systematic 
error of a measurement process, such limits are 
applicable to the error of a single measurement or 
adjusted value, not as a unique individual outcome, 
but only as a typical case of the errors characteristic 
of measurements of the same quantity that might 
have been, or might be, yielded by the same measurement 
process under the same conditions. 

b. Systematic Error of a Measurement Process 

When the limiting mean μ associated with measurement 
of the magnitude of a quantity by a particular 
measurement process does not agree with the true 
value τ of the magnitude concerned, the measurement 
process is said to have a systematic error, or bias, of 
magnitude μ − τ. 
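
The two definitions given so far fit together in a simple identity, which is spelled out here for illustration. Writing x for a single measurement, μ for the limiting mean of the process, and τ for the true value, the error of the measurement splits into a deviation from the limiting mean and the systematic error of the process:

```latex
\[
  \underbrace{x - \tau}_{\text{error of } x}
  \;=\;
  \underbrace{(x - \mu)}_{\text{deviation from the limiting mean}}
  \;+\;
  \underbrace{(\mu - \tau)}_{\text{systematic error, or bias}}
\]
```

With invented numbers, if τ = 100.000, μ = 100.003, and x = 100.001, then x − μ = −0.002 and μ − τ = +0.003, so that the error of x is +0.001. Only the first term varies from one measurement to the next; the second is a fixed property of the measurement process for the quantity concerned.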

The systematic error of a measurement process 
will ordinarily have both constant and variable 
components. Consider, for example, measurement 
of the distance between two points by means of a 
graduated metal tape [Holman 1892, p. 9]. Possible 
causes of systematic error that immediately come to 
mind are: 

(1) Mistakes in numbering the scale divisions of the tape; 
(2) irregular spacing of the divisions of the tape; 
(3) sag of tape; 
(4) stretch of tape; 
(5) temperature not that for which the tape was calibrated. 

For any single distance, the effects of (1) and (2) 
will be constant; and the effects of (3) and (4) will 
undoubtedly each contain a constant component 
characteristic of the distance concerned. Some of 
these effects will be of one sign, some of the other, and 
their algebraic sum will determine the constant error 
of this measurement process with respect to the 
particular distance concerned. Furthermore, the 
"constant error" of this measurement process will 
be different (at least, conceptually) for different 
distances measured. 

In the case of repeated measurement of a single 
distance, the effect of (5), and at least portions of 
the effects of (3) and (4), may be expected to vary 
from one "occasion" to the next (e.g., from day to 
day), thus contributing variable components to the 
systematic error of the process. 

A large fraction of the variable contributions of 
(3) and (4) could, and in practice no doubt would, 
be removed by stretching the tape by a spring balance 
or other means so that it is always under the same 
tension. The stretch corresponding to a particular 
distance would then be nearly the same at all times. 



and a fixed correction could be made for most of the 
sag corresponding to this distance. Furthermore, the 
effect of (5) could, and in practice probably would, 
be reduced by determining the temperature of the 
tape at various points along its length and applying a 
temperature correction. By comparison of the tape 
with a standard, the error arising from (1) could be 
eliminated entirely, and corrections determined as a 
basis for eliminating, or at least, reducing the effect 
of (2). 
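
The temperature and sag corrections mentioned above can be sketched numerically. The formulas used below are the ordinary textbook ones for tape measurement and are not taken from the source: the temperature correction is α·L·(T − T₀), and the sag correction for a tape hanging freely between its two ends is −w²L³/(24P²), a parabolic approximation to the catenary. The function names, the default expansion coefficient for steel (11.6 × 10⁻⁶ per °C), and the numerical values are assumptions made for the example.

```python
def temperature_correction(measured_length, tape_temp, calib_temp, alpha=11.6e-6):
    """Correction to ADD to a taped length for thermal expansion of the tape.

    alpha is an assumed typical expansion coefficient for steel, per deg C.
    """
    return alpha * measured_length * (tape_temp - calib_temp)


def sag_correction(span_length, weight_per_length, tension):
    """Correction to ADD (always negative) for a tape supported only at its ends.

    Parabolic approximation; weight_per_length in N/m, tension in N, length in m.
    """
    return -(weight_per_length ** 2) * span_length ** 3 / (24.0 * tension ** 2)


def corrected_length(measured_length, tape_temp, calib_temp, weight_per_length, tension):
    """Apply both corrections to a single unsupported span."""
    return (measured_length
            + temperature_correction(measured_length, tape_temp, calib_temp)
            + sag_correction(measured_length, weight_per_length, tension))


# Hypothetical case: 30 m span, tape 10 deg C above its calibration
# temperature, tape weight 0.2 N/m, applied tension 50 N.
print(temperature_correction(30.0, 30.0, 20.0))
print(sag_correction(30.0, 0.2, 50.0))
print(corrected_length(30.0, 30.0, 20.0, 0.2, 50.0))
```

Holding the tension fixed, as with the spring balance mentioned above, makes the sag term a fixed correction for a given distance, whereas the temperature term must be recomputed on each occasion from the observed tape temperature.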

As in the foregoing example, there are usually 
certain obvious sources of systematic error. Un- 
fortunately, there are generally additional sources 
of systematic error, the detection, diagnosis, and 
eradication of which call for much patience and 
acumen on the part of the observer. The work 
involved in their detection, diagnosis, and eradica- 
tion often far exceeds that of taking the final 
measurements, and is sometimes discouraging to 
the experienced observer as well as to the beginner. 
Fortunately, there are various statistical tools that 
are helpful in this connection, and Olmstead [1952] 
has found that of these the two most effective and 
universally useful are the average (x̄) and range (R) 
charts of industrial quality control. (For details 
on the construction and use of x̄- and R-charts, 
see, for example, the ASTM Manual on Quality 
Control of Materials [ASTM 1951, pp. 61-63 and 
p. 83]; or American Standards Z1.2-1958 and 
Z1.3-1958 [ASA 1958b, ASA 1958c].) 
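
As an illustration of how average and range charts are set up, the sketch below computes three-sigma control limits from subgroups of equal size and flags subgroups falling outside them. It is a minimal sketch under stated assumptions, not a substitute for the manuals cited: the factors A2, D3, and D4 are the commonly tabulated values for subgroup sizes 2 to 5 and should be checked against the tables of whichever manual is being followed, and the names `CHART_FACTORS`, `xbar_r_limits`, and `out_of_control`, as well as the data, are invented for the example.

```python
import statistics

# Commonly tabulated three-sigma chart factors (A2, D3, D4) by subgroup size.
# Assumed values for illustration; verify against the manual in use.
CHART_FACTORS = {
    2: (1.880, 0.0, 3.267),
    3: (1.023, 0.0, 2.574),
    4: (0.729, 0.0, 2.282),
    5: (0.577, 0.0, 2.114),
}


def xbar_r_limits(subgroups):
    """Return (lower, center, upper) limits for the average and range charts."""
    n = len(subgroups[0])
    if n not in CHART_FACTORS or any(len(g) != n for g in subgroups):
        raise ValueError("subgroups must all have the same size, between 2 and 5")
    a2, d3, d4 = CHART_FACTORS[n]
    grand_mean = statistics.fmean(statistics.fmean(g) for g in subgroups)
    mean_range = statistics.fmean(max(g) - min(g) for g in subgroups)
    return {
        "average": (grand_mean - a2 * mean_range, grand_mean, grand_mean + a2 * mean_range),
        "range": (d3 * mean_range, mean_range, d4 * mean_range),
    }


def out_of_control(subgroups):
    """Indices of subgroups whose average or range falls outside the limits."""
    limits = xbar_r_limits(subgroups)
    lo, _, hi = limits["average"]
    r_lo, _, r_hi = limits["range"]
    flagged = []
    for i, g in enumerate(subgroups):
        avg = statistics.fmean(g)
        rng = max(g) - min(g)
        if not (lo <= avg <= hi) or not (r_lo <= rng <= r_hi):
            flagged.append(i)
    return flagged


# Invented data: nine stable days followed by one day shifted upward.
stable = [9.8, 9.9, 10.0, 10.1, 10.2]
shifted = [10.8, 10.9, 11.0, 11.1, 11.2]
data = [list(stable) for _ in range(9)] + [shifted]
print(xbar_r_limits(data))
print(out_of_control(data))
```

In this invented example the last subgroup has the same range as the others but a shifted average, so only the average chart signals, which is the pattern expected from a between-day systematic effect.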

c. Concept of True Value 

In the foregoing we have defined the error of a 
measurement x to be the difference x − τ between the 
measurement and the true value τ of the magnitude 
of the quantity concerned; and the systematic error, 
or bias, of a measurement process as the difference 
μ − τ between the limiting mean μ associated with the 
measurement of a particular quantity by the measurement 
process concerned, and the true value τ of the 
magnitude of this quantity. This immediately 
raises the question: Just how is the "true value" of 
the magnitude of a particular property of some thing 
defined? In the final analysis, the "true value" of 
the magnitude of a quantity is defined by agreement 
among experts on an exemplar method for the measurement 
of its magnitude — it is the limiting mean of a 
conceptual exemplar process that is an ideal realization 
of the agreed-upon exemplar method. And the 
refinement to which one should go in specifying the 
exemplar process will depend on the purposes for 
which a determination of the magnitude of the quan- 
tity concerned is needed — not just the immediate 
purpose for which measurements are to be taken but 
also the other uses to which these measurements, or a 
final adjusted value derived therefrom, may possibly 
be put. 

Consider, for example, the "true value" of the 
length of a particular gage block. In our minds we 
envisage the gage block as a rectangular parallel- 
epiped, and its length is, of course, the distance be- 
tween its two "end" faces. But it is practically 
certain that the particular gage block in question is 
not an exact rectangular parallelepiped; and that 






its two end faces are not planes, nor even absolutely 
smooth surfaces. Shall we define the "true 
length" of this gage block to be the distance between 
the "tops" of the highest "mountains" at each end, 
i.e., the distance between the two "outermost points" 
at each end? If so, is this distance to be measured 
diagonally, if necessary, or parallel to the "length- 
wise axis" of the gage block? If the latter, then we 
have the problem of how this "length-wise axis" is 
to be defined, especially in the case of a thin gage 
block whose length corresponds to what would 
ordiiiaril}^ be considered to be its thickness. Or 
shall we be, perhaps, more sophisticated, and en- 
visage a "mean plane" at each end, which in general 
will not be parallel to each other, and define the 
length of this gage block to be the distance between 
two particular points on these planes. If we choose 
the "outermost points" we again have the problem of 
the direction in which the distance is to be measured. 
Alternatively, we might define the length of this 
gage l)lock to be the distance between two strictly 
parallel and conceptually perfect optical flats "just 
touching" the gage block at each end. If so, then 
is the "true distance" between these flats defined in 
terms of wavelengths of light via the techniques of 
optical interferonietry the "true length" of the gage 
block appropriate to the purposes for which the gage 
block is to be used, namely, to calibrate gages and to 
determine the lengths of other objects by mechanical 
comparisons? Furthermore, it is clear, that the 
intrinsic difficulty of defining the "true value" of the 
kiu/th of a particular gage block is not eliminated if, 
instead, we undertake to define the "true value" of 
the dijf'erence in length of two particular gage blocks, 
one of which is a standard, the accepted value of whose 
length is, say, 7n microinches exactly, by industry, 
national or international agreement. 

Similar difficulties arise, of course, in the definition
of the "true value" of the mass of a mass standard,
one of which has been resolved by international
agreement. In defining the "true value" of the mass
of a particular metallic mass standard, shall the mass
of this particular standard be envisaged as the mass
of its metallic substance alone, relative to the
International Prototype Kilogram, or as the mass of
its metallic substance plus the mass of the air and
water vapor adsorbed upon its surface under standard
conditions? The difference amounts to about
45 μg in the case of a platinum-iridium standard
kilogram, and becomes critical in the case of 500
mg standards. The mass of a mass standard is,
therefore, specified in measurement science to be the
mass of the metallic substance of the standard plus
the mass of the average volume of air adsorbed upon
its surface under standard conditions. Definition of
the "true value" of the mass of a mass standard, and
a fortiori, of the difference in mass of two mass
standards is, therefore, a very complex matter.

W. Edwards Deming uses the expression "preferred
procedure" for what we have termed an
"exemplar method," and very sagely remarks that
"a preferred procedure is distinguished by the fact
that it supposedly gives or would give results nearest
to what are needed for a particular end; and also by
the fact that it is more expensive or more time
consuming, or even impossible to carry out," adding
that "as a preferred procedure is always subject to
modification or obsolescence, we are forced to
conclude that neither the accuracy nor the bias of any
procedure can ever be known in a logical sense."
[Deming 1950, pp. 15-17.]

It should be evident from the foregoing that the 
"true value" of the magnitude of some property of 
a thing or system cannot be defined with complete 
absolute exactitude. 

As Cassius J. Keyser has remarked, "Absolute
certainty is a privilege of uneducated minds — and 
fanatics. It is, for scientific folk, an unattainable 
ideal." [Keyser 1922, p. 120.] The degree of refine- 
ment to which one will, or ought, to go in a particular 
instance will depend on the uses for which knowledge 
of the magnitude of the property concerned is needed.
The "true value" of the length of a piece of cloth in
everyday commerce is certainly a fuzzy concept.
"Certainly we are not going to specify that the
cloth shall be measured while suspended horizontally
under a tension of x pounds, at an ambient
temperature of y degrees and a relative humidity of
z percent" [Simon 1946, p. 654]. On the other hand,
a moderate degree of refinement is necessary in
defining the "true length" and "true width" of the
recessed area in a window sash to which a pane of
glass is to be fitted. Considerably greater refinement
is needed in the definition of the "true value" of the
length of a gage block, of the mass of a mass standard,
or of the frequency of a frequency standard; and in
the last-mentioned case there is not today, I understand,
complete agreement among experts on the
matter.

Indeed, as is evident from the foregoing, the "true
value" of the magnitude of a particular quantity is
intimately linked to the purposes for which a value
of the magnitude of this quantity is needed, and its
"true value" cannot, in the final analysis, be defined
meaningfully and usefully in isolation from these
needs. Therefore, as this fact becomes more widely
recognized in science and engineering, I hope that
the traditional term "true value" will be discarded
in measurement theory and practice, and replaced
by some more appropriate term such as "target
value"⁶ that conveys the idea of being the value
that one would like to obtain for the purpose in
hand, without any implication that it is some sort
of permanent constant preexisting and transcending
any use that we may have for it. I have retained
the traditional expression "true value" in the sequel
because of its greater familiarity, but shall always
mean by it the relevant "target value."

6 "We admit the existence of systematic error— of a difference between the 
quantity measured (the measured quantity) and the quantity of interest (the 
target quantity). We ask the observations about the measured quantity. We 
ask our subject matter knowledge, intuition, and general information about the 
relation between the measured quantity and the target quantity." [Cochran
et al. 1954, p. 33.]

"Some people prefer the term 'true value', although others excoriate
it as philosophically unsound.

"We could also call the reference level a 'target value'. In a way this is a
bad term because it implies that it is something we want to find through the
measurement process rather than something we ought to find because, like Mt.
Everest, it is there. Unfortunately our desires can influence our notion of what
is true, and we can even unconsciously bring the latter into agreement with the
former; my use of the term 'target value' is not meant to imply that I think it
legitimate to equate what we would like to see with what is there." [Murphy
1961, p. 265.]

d. Concepts of the Precision and Accuracy of a Measurement
Process

By the precision of a measurement process we 
mean the degree of mutual agreement characteristic 
of independent measurements of a single quantity 
yielded by repeated applications of the process under 
specified conditions; and by its accuracy the degree 
of agreement of such measurements with the true 
value of the magnitude of the quantity concerned. 
In other words, the accuracy of a measurement process
refers to, and is determined by, the degree of
conformity to the truth that is characteristic of
independent measurements of a single quantity produced
(or producible) by the repeated applications of the
process under specified conditions; whereas its precision
refers solely to, and is determined solely by, the
degree of conformity to each other characteristic of
such measurements, irrespective of whether they
tend to be close to or far from the truth. Thus, accuracy
has to do with closeness to the truth; precision,
only with closeness together.

This distinction between the meanings of the
terms "accuracy" and "precision" as applied to
measurement processes and measuring instruments
is consistent with the etymological roots of these
words. "Etymologically the term 'accurate' has
a Latin origin meaning 'to take pains with' and refers
to the care bestowed upon a human effort to make
such effort what it ought to be, and 'accuracy' in
common dictionary parlance implies freedom from
mistakes or exact conformity to truth. 'Precise,' on
the other hand, has its origin in a term meaning
'cut off, brief, concise'; and 'precision' is supposed
to imply the property of determinate limitations
or being exactly and sharply defined." [Shewhart
1939, p. 124.] Thus one can properly speak of a
national, state, or local law as being "precise," but
not as being "accurate": to what truth can it
conform? On the other hand, if one spoke of a
particular translation as being "accurate" this
would imply a high degree of fidelity to the original
"attained by the exercise of care." Whereas, to
speak of it as being "precise," would imply merely
that it is unambiguous, without indicating whether
it is or is not correct.⁷

In spite of the distinct difference between the 
etymological meanings of the terms "accuracy" 
and "precision," they are treated as synonyms in 
many standard dictionaries; and Merriam-Webster 
[1942], after drawing the helpful distinctions quoted 
in the foregoing footnote, promptly topples the 
structure so carefully built by adding "scrupulous 
exactness" as an alternative meaning of "precise." 
Consequently it is not surprising that "There are 
probably few words as loosely used by scientists 
as precision and accuracy. — It is not unusual to 
find them used interchangeably in scientific writ- 
ings." [Schrock 1950, p. 10.] 

7 It is sometimes helpful to distinguish between "correct," "accurate," and 
"exact": "CORRECT, the most colorless term, implies scarcely more than 
freedom from fault or error, as judged by some (usually) conventional or acknowl- 
edged standard; . . . ACCURATE implies, more positively, fidelity to fact 
or truth attained by the exercise of care; . . . EXACT emphasizes the strictness
or rigor of the agreement, which neither exceeds nor falls short of the fact, standard
or truth; . . . PRECISE stresses rather sharpness of definition or delimitation
. . ." [Merriam-Webster 1942, p. 203].



On the other hand, as Shewhart has remarked: 

"Careful writers in the theory of errors, of course, have
always insisted that accuracy involves in some way or other
the difference between what is observed and what is true,
whereas precision involves the concept of reproducibility of
what is observed. Thus Laws, writing on electrical measurements,
says:⁸ 'Every experimenter must form his own
estimate of the accuracy, or approach to the absolute truth
obtained by the use of his instruments and processes of
measurement. He must remember that a high precision,
or agreement of the results among themselves, is no indication
that the quantity under measurement has been accurately
determined.' As another example we may take the following
comment from a recent and authoritative treatise on chemical
analysis:⁹ 'The analyst should form the habit of estimating
the probable accuracy of his work. It is a common mistake
to confuse accuracy and precision. Accuracy is a measure
of the degree of correctness. Precision is a measure of
reproducibility in the hands of a given operator.'" [Shewhart
1939, pp. 124-125.]

More recently, Lundell, Hoffman, and their associates
at the National Bureau of Standards have
reemphasized the importance of the distinction between
"precision" and "accuracy":

"In discussions of chemical analysis, the terms precision
and accuracy are often used interchangeably and therefore 
incorrectly, for precision is a measure of reproducibility, 
whereas accuracy is a measure of correctness. The analyst 
is vitally interested in both, for his results must be sufficiently 
accurate for the purpose in mind, and he cannot achieve 
accuracy without precision, especially since his reported 
result is often based on one determination and rarely on more 
than three determinations. The recipient of the analysis 
is interested in accuracy alone, and only in accuracy suffi- 
cient for his purposes." [Hillebrand et al., 1953, p. 3.] 

It is most unfortunate that in everyday parlance
we often speak of "accuracy and precision," because
accuracy requires precision, but precision does not
necessarily imply accuracy.

"It is, in fact, interesting to compare the measurement
situation with that of a marksman aiming at a target. We 
would call him a precise marksman if, in firing a sequence of 
rounds, he were able to place all his shots in a rather small 
circle on the target. Any other rifleman unable to group his 
shots in such a small circle would naturally be regarded as 
less precise. Most people would accept this characteriza- 
tion whether either rifleman hits the bull's-eye or not. 

"Surely all would agree that if our man hits or nearly
hits the bull's-eye on all occasions, he should be called an 
accurate marksman. Unhappily, he may be a very precise 
marksman, but if his rifle is out of adjustment, perhaps the 
small circle of shots is centered at a point some distance from 
the bull's-eye. In that case we might regard him as an in- 
accurate marksman. Perhaps we should say that he is a 
potentially accurate marksman firing with a faulty rifle, 
but speaking categorically, we should have to say that the 
results were inaccurate." [Murphy 1961, p. 265.] 

It follows from what has been said thus far that
"if the precisions of two processes are the same but
the biases are different, the process of smaller bias
may be said to have higher accuracy while if the
biases are both negligible, the process of higher
precision may be said to have higher accuracy."
Unfortunately, "in other cases such a simple comparison
may be impossible." [ASTM 1961, p. 1760.]



8 Frank A. Laws, Electrical Measurements, p. 593 (McGraw-Hill, New York,
N.Y., 1917).

9 G. E. F. Lundell and J. I. Hoffman, Outlines of Methods of Chemical
Analysis, p. 220 (John Wiley and Sons, New York, N.Y., 1938).
To fully appreciate the preceding statement, and
especially the difficulty of comparing accuracies
in some cases, let us consider figures 1 and 2, in
which the origins of the scales correspond to the
true value $\tau$ of the quantity measured, so that
the curves shown may be regarded as depicting the
distributions of errors of the measurements yielded
by a selection of different measurement processes.
Consider first the three symmetrical distributions
in the top half of figure 1. All three of these
distributions are centered on zero, so that these
measurement processes have no bias. It is evident
that the process of highest precision, c, is also the
process of highest accuracy; and that the process of
least precision, a, is also the process of least accuracy.
Since curve b in the upper half of figure 1 and curve
d in the lower half have identical size and shape,
the corresponding processes have the same precision;
but process b is without bias, whereas process d
has a positive bias of two units, so that process b
is clearly the more accurate. (In particular we may
note that whereas it is practically certain that
process b will not yield a measurement deviating
from the truth by more than two units, exactly
one-half of the measurements yielded by process d
will deviate from the truth by this much or more.)
Similar remarks clearly apply to processes c and e
corresponding to curve c in the upper half and curve
e in the lower half of figure 1, but in this instance the
superiority of process c relative to process e with
respect to accuracy is even more marked. (In
particular, we may note that whereas it is practically
certain that no measurement yielded by process c
will deviate from the truth by as much as one unit,
it is practically certain that every measurement
yielded by process e will deviate from the truth by
more than one unit.)

Figure 1. Distributions of errors of some biased and unbiased
measurement processes of various precisions.

Figure 2, which is essentially the same as one given
by General Simon [1946, fig. 1], portrays three
measurement processes A, B, and C, differing from each
other with respect to both precision and bias.
Comparison of these three processes with respect to
accuracy is not quite so simple. First, it is evident
that, although process A has greater precision than
process B, process B is the more accurate of the two.
(In particular, it is practically certain that none of
the measurements yielded by process B will deviate
from the truth by more than four units, whereas 50
percent of the measurements from process A will
deviate from the truth by four units or more.)
Next, is process B more (or less) accurate than process
C, which is unbiased but has a very low precision?
Process B has a positive bias of two units, but has
sufficiently greater precision than process C to also
have greater accuracy than process C. (While
approximately 50 percent of the measurements
yielded by process C will deviate from the truth by
more than two units (in either direction), and
exactly 50 percent of the measurements yielded by
process B will deviate from the truth by two units
or more (in the positive direction only), it cannot
be ignored that about 10 percent of the measurements
yielded by process C will deviate from the
truth by four units or more whereas it is practically
certain that no measurement yielded by process B
will deviate from the truth by as much as four units.)
Similarly, it may be argued that process A, in spite
of its bias, has greater accuracy than process C
"since the range in measurements of C more than
covers the corresponding ranges of A or B." [Simon
1946, p. 654.] While this conclusion, that of the
three measurement processes depicted in figure 2
process C has the least accuracy, may not be entirely
acceptable to some persons, it is consistent with
Gauss' dictum, in a letter to F. W. Bessel, to the
effect that maximizing the probability of a zero error
is less important than minimizing the "average"
injurious effects of errors in general. [C. F. Gauss,
1839, pp. 146-147.]
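
The percentages quoted in the discussion of figure 2 can be roughly reproduced with a short calculation. The sketch below assumes normally distributed errors, and the (bias, standard deviation) pairs are hypothetical values chosen only to be broadly consistent with the statements in the text; they are not read from the original figure. The function name prob_error_at_least is likewise an illustrative choice.

```python
from statistics import NormalDist


def prob_error_at_least(threshold, bias, sigma):
    """P(|x - tau| >= threshold) when the error x - tau is N(bias, sigma^2)."""
    errors = NormalDist(mu=bias, sigma=sigma)
    inside = errors.cdf(threshold) - errors.cdf(-threshold)
    return 1.0 - inside


# Hypothetical (bias, sigma) pairs, assumed for illustration only.
PROCESSES = {
    "A": (4.0, 0.5),   # high precision, large bias
    "B": (2.0, 0.7),   # moderate precision, moderate bias
    "C": (0.0, 2.43),  # unbiased, low precision
}

for name, (bias, sigma) in PROCESSES.items():
    p2 = prob_error_at_least(2.0, bias, sigma)
    p4 = prob_error_at_least(4.0, bias, sigma)
    print(f"process {name}: P(|error| >= 2) = {p2:.3f}, "
          f"P(|error| >= 4) = {p4:.3f}")
```

Under these assumed values, about half of the measurements from process B miss by two units or more but almost none miss by four, whereas roughly a tenth of those from process C miss by four units or more. The agreement with the text is only approximate, since the true shapes of the curves are not known here.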

Before leaving figure 2, we must not fail to join
General Simon in remarking that "the average of a
large number of measurements from [process] C will
be more accurate than a similar average from either
A or B" [Simon 1946, p. 654]. This point is actually
illustrated in our figure 1: the three curves in the top
half of figure 1 portray the distributions of errors of
single measurements (curve a), of averages of 12
measurements (curve b), and of averages of 144
measurements (curve c) from process C; and curves d and e
in the lower half show the distributions of errors of
individual measurements (curve d), and of averages
of 12 measurements (curve e) from process B,
respectively. It is evident that averages of 12
measurements from process C (curve b in upper
portion of fig. 1) have not only greater accuracy than
individual measurements from process B (curve d in
lower portion of the figure), but also greater accuracy
than averages of 12 measurements from process B
(curve e in lower portion).

On the other hand, it is obvious that, if our choice
is between individual measurements from process C
(curve a) and averages of 12 measurements from
process B (curve e), the latter will clearly provide
greater accuracy. In brief, a procedure with a small
bias and a high precision can be more accurate than an
unbiased procedure of low precision. It is important
to realize this, for in practical life it is often far better
to always be quite close to the true value than to
deviate all over the place in individual cases but
be strictly correct "on the average," like the duck
hunter who put one swarm of shot ahead of the duck,
and one swarm behind, lost his quarry, but had the
dubious satisfaction of knowing that in theory he
had hit it "on the average." This we must remember:
in practical life we rarely make a very large number
of measurements of a given type; we can't wait to
be right on the average; our measurements must
stand up in individual cases as often as possible.
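
One way to put numbers on these comparisons is the root-mean-square error of an average of n independent measurements, sqrt(bias^2 + sigma^2 / n). This is offered only as one possible index of overall error, not as the definition of accuracy used in the text, and the bias and standard deviation values below are hypothetical, chosen to be roughly consistent with the description of processes B and C.

```python
import math


def rms_error_of_average(bias, sigma, n=1):
    """Root-mean-square error of the mean of n independent measurements.

    Averaging reduces the random component (sigma^2 / n) but leaves the
    bias untouched.
    """
    if n < 1:
        raise ValueError("n must be at least 1")
    return math.sqrt(bias ** 2 + sigma ** 2 / n)


# Hypothetical values, assumed for illustration only.
BIAS_B, SIGMA_B = 2.0, 0.7    # biased, fairly precise
BIAS_C, SIGMA_C = 0.0, 2.43   # unbiased, imprecise

single_c = rms_error_of_average(BIAS_C, SIGMA_C, n=1)
single_b = rms_error_of_average(BIAS_B, SIGMA_B, n=1)
mean12_b = rms_error_of_average(BIAS_B, SIGMA_B, n=12)
mean12_c = rms_error_of_average(BIAS_C, SIGMA_C, n=12)

print(f"single C: {single_c:.2f}   single B: {single_b:.2f}")
print(f"mean of 12 from B: {mean12_b:.2f}   mean of 12 from C: {mean12_c:.2f}")
```

With these assumed values the ordering matches the argument above: averages of 12 from process C beat both individual measurements and averages of 12 from process B, while averages of 12 from process B still beat individual measurements from process C.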

Despite the foregoing, freedom from bias, that is,
freedom from "large" bias, is a desirable characteristic
of a measurement process. After all, we want
our measurements to yield us a determination that
we can use as a substitute for the unknown value of a
particular magnitude whose value we need for some
purpose; we don't want a determination of the
value of some other magnitude whose relation to the
one we need is indefinitely known.

In view of the difficulty of comparing with respect 
to accuracy measurement processes that differ both 
in bias and precision, some writers have elected to
take the easy way out by defining "accuracy" to be 
equivalent to absence of bias, saying that of two 
measurement processes having different biases, the 
process of smaller bias is the more "accurate" 
regardless of the relation of their respective precisions. 
(See, for example, Beers [1953, p. 4], Ostle [1954, p. 4], 
and Schenck [1961, p. 4, p. 14].) While the adoption 
of this concept of "accuracy" certainly makes the 
discussion of "accuracy" and "precision" simpler for 
the authors concerned, this practice is contrary to
the principle of "conservation of linguistic resources,"
as R. B. Murphy puts it, adding: "It seems to me
that the terms 'bias' and 'systematic error' are
adequate to cover the situation with which they are
concerned. If, nevertheless, we add the term
'accuracy' to apply again in this restricted sense,
we are left wordless — at the moment at least — when 
it comes to the idea of over-all error. From the 
point of view of the need for a term it is hard to 
defend the view that accuracy should concern itself
solely with bias. . . . [and] there is overwhelming 
evidence that we need a term at least for the concept 
of over-all error." [Murphy 1961, pp. 265-266.] 

3.4. Mathematical Specification of the Precision of 
a Measurement Process 

a. Simple Statistical Control 

Let us now consider the mathematical definition 
of the precision of a measurement process under a 
fixed set of circumstances. By definition, the pre- 
cision of a measurement process has to do with the 
"closeness together" that is typical of successive 
measurements of a single quantity generated by 
applications of the process under these fixed condi- 
tions. Otherwise expressed, it has to do with the 
typical "closeness together" of the two individual 
measurements constituting an arbitrary pair. If the 
expression "typical 'closeness together'" is to be
meaningful, the measurements generated by repeated
application of the process to the measurement of a
single quantity must be homogeneous in some sense.
Therefore, for the moment, let us assume that the
measurement process is in a state of simple statistical
control, so that the successive measurements in each
of the sequences (1), ($i = 1, 2, 3, \ldots$), generated by
the process may all be regarded as "observed" values
of independent identically distributed random variables.

Just as we may regard each individual measurement
$x_{ij}$ in a particular sequence (1) as striving to
express the value of the limiting mean $\mu$, so also we
may regard each individual difference $x_{ij} - x_{ik}$, $j \neq k$,
as striving to express the characteristic spread
between an arbitrary pair of measurements, $x'$ and
$x''$, say. For this purpose the signs of these differences
are clearly irrelevant. Therefore, by analogy
with our use of a sequence of cumulative arithmetic
means, (2), to achieve a mathematical formulation
of the concept of a limiting mean associated with
measurement of a given quantity by a particular
measurement process, let us adopt the sequence of
cumulative arithmetic means of the squares of the
$n(n-1)/2$ distinct differences among the first $n$
measurements of a particular sequence (1), for
example, the sequence

$$\overline{(d^2)}_{in} = \frac{2}{n(n-1)} \sum_{j=1}^{n-1} \sum_{k=j+1}^{n} (x_{ij} - x_{ik})^2, \qquad (n = 2, 3, \ldots), \tag{3}$$

as the basis of a mathematical formulation of the
concept of the precision of a measurement process.
The necessary and sufficient condition for almost
sure convergence of the sequence (3) to a finite limit,
say $\Delta^2$, is that the Strong Law of Large Numbers be
applicable to the sequence

$$x_{i1}^2,\ x_{i2}^2,\ \ldots,\ x_{ij}^2,\ \ldots \tag{4}$$

consisting of the squares of the corresponding terms
of the original sequence (1). (Boundedness of the
$x$'s in addition to statistical control is, for example,
sufficient to ensure that the sequence (4) will also
obey the Strong Law of Large Numbers.) If the
Strong Law of Large Numbers is applicable to the
sequence of squares (4), and if the measurement
process is in a state of simple statistical control,
then the cumulative arithmetic means of the squares
of the measurements, that is, the sequence

$$\overline{(x^2)}_{in} = \frac{1}{n} \sum_{j=1}^{n} x_{ij}^2, \qquad (n = 1, 2, \ldots), \tag{5}$$

will almost surely tend to a limit, say $S$, the magnitude
of which will depend on the quantity measured and
the measurement process involved, but not on the
"occasion" (identified by the subscript "$i$"). By
virtue of an algebraic identity that is well known
to students of mathematical inequalities, namely,

$$n \sum_{j=1}^{n} a_j^2 - \left( \sum_{j=1}^{n} a_j \right)^2 = \frac{1}{2} \sum_{j=1}^{n} \sum_{k=1}^{n} (a_j - a_k)^2, \qquad (n \geq 2), \tag{6}$$

and of the fact that the right-hand side of (6) is
always positive except when the $a$'s are all equal,
it is easily seen, on dividing both sides of (6) by
$n^2$, that $S$ will always exceed $\mu^2$, the square of the
(almost sure) limit of the sequence (2), so that we
may write $S = \mu^2 + \sigma^2$, with $\sigma^2 > 0$. Furthermore,
applying the algebraic identity (6) in reverse to
the right-hand side of (3) yields the following
relationship between the corresponding terms of
sequences (3), (5), and (1):

$$\overline{(d^2)}_{in} = 2 \left( \frac{n}{n-1} \right) \left\{ \overline{(x^2)}_{in} - (\bar{x}_{in})^2 \right\} \geq 0, \qquad (n \geq 2). \tag{7}$$

Hence, if a measurement process is in a state of
simple statistical control and the Strong Law of
Large Numbers is applicable to a sequence of squared
measurements (4), then the sequence $\overline{(d^2)}_{in}$, defined
by (3), will, in view of (7), tend almost surely to a
finite limit $\Delta^2 = 2\sigma^2$. Thus we see that $\sigma^2$, termed
the variance of the measurement process, is the mean
value of one-half of the squared difference between
two arbitrary measurements $x'$ and $x''$; that is,

$$\sigma^2 = \tfrac{1}{2}\, \overline{(x' - x'')^2}, \tag{8}$$

and provides an indication of the imprecision of the
process. The square root of the variance, $\sigma$, is
termed the standard deviation of the process.

It is natural, therefore, on the basis of a single
sequence of $n$ measurements of a single quantity,
to take

$$s^2 = \frac{1}{2} \cdot \frac{2}{n(n-1)} \sum_{j=1}^{n-1} \sum_{k=j+1}^{n} (x_j - x_k)^2 = \frac{1}{n-1} \sum_{j=1}^{n} (x_j - \bar{x})^2 \tag{9}$$

as the sample estimate of the underlying variance
$\sigma^2$; and the square root, $s$, as the sample estimate
of $\sigma$.¹⁰

From (9), since $\bar{x} = \bar{x}_n$ tends (almost surely) to $\mu$,
it is evident that $\sigma^2$ is also the mean value of the
squared deviations of individual measurements from
the limiting mean $\mu$ of the process, that is,
$\sigma^2 = \overline{(x - \mu)^2}$, so that the standard deviation $\sigma$ may be
regarded, in the language of mechanics, as the
radius of gyration of the distribution of all possible
measurements $x$ about $\mu$, the limiting mean of the
process.
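
The equality of the two forms in (9) is easy to check numerically. The following is a minimal sketch, assuming a short list of made-up readings; the function names are illustrative only. It also checks the remark in footnote 10 that dividing by n instead of n-1 corresponds to averaging over all n^2 ordered pairs, including the n identically zero terms with j = k.

```python
from itertools import combinations, product
from statistics import pvariance, variance


def half_mean_sq_diff_distinct(xs):
    """One-half the mean squared difference over the n(n-1)/2 distinct pairs."""
    if len(xs) < 2:
        raise ValueError("need at least two measurements")
    pairs = list(combinations(xs, 2))
    return 0.5 * sum((a - b) ** 2 for a, b in pairs) / len(pairs)


def half_mean_sq_diff_all(xs):
    """Same, but over all n^2 ordered pairs, including the zero terms j = k."""
    pairs = list(product(xs, repeat=2))
    return 0.5 * sum((a - b) ** 2 for a, b in pairs) / len(pairs)


# Made-up readings, for illustration only.
readings = [10.1, 9.8, 10.3, 10.0, 9.9]

s2_pairs = half_mean_sq_diff_distinct(readings)   # left-hand form of (9)
s2_usual = variance(readings)                     # divisor n - 1
v_all_pairs = half_mean_sq_diff_all(readings)     # footnote 10 construction
v_divide_n = pvariance(readings)                  # divisor n

print(s2_pairs, s2_usual)
print(v_all_pairs, v_divide_n)
```

The pairwise form costs on the order of n^2 operations, so in practice the right-hand form of (9) is the one to compute; the pairwise form is shown here only to make the identity concrete.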

Remark: Mathematically the foregoing discussion
can be carried out equally well in terms of the
absolute (unsigned) values of the differences instead
of in terms of their squares. Such an approach is,
mathematically speaking, somewhat more general
in that it requires for its validity merely that the
Strong Law of Large Numbers be applicable to the
sequence $|x_{i1}|, |x_{i2}|, \ldots, |x_{ij}|, \ldots$ of absolute values
of the $x_{ij}$ rather than to the sequence (4) of their
squares. From the practical viewpoint, however,
this greater generality is entirely illusory, and the
mathematics of absolute values of variables is
always more cumbersome than the mathematics of
their squares. For example, the arithmetic mean
of the absolute values of the $n(n-1)/2$ distinct
differences among $n$ measurements, i.e.,

$$\overline{|d|}_n = \frac{2}{n(n-1)} \sum_{j=1}^{n-1} \sum_{k=j+1}^{n} |x_j - x_k|, \tag{10}$$

is not expressible as a multiple of the sum of the
absolute deviations of the measurements from their
mean, $\sum |x_j - \bar{x}|$, and for large values of $n$ the
evaluation of (10) presents computational difficulties.
The approach in terms of the absolute values of
the differences also has the disadvantage from the
practical viewpoint that, as we shall see in a moment,
components of imprecision are additive in terms of
squared quantities such as $\sigma^2$, so that in this sense
the variance $\sigma^2$ is a more appropriate measure of the
dispersion of the $x$'s about their limiting mean $\mu$
than is $\sigma$ itself.

10 From the algebraic identity (6), it is evident that the practice in some circles
of dividing $\sum_{j=1}^{n} (x_j - \bar{x})^2$ by $n$, instead of $n-1$, amounts to including each of the
distinct squared differences $(x_j - x_k)^2$, $j \neq k$, twice in the summation, together with
$n$ identically zero terms $(x_j - x_k)^2$, $j = k$, each included once, and then dividing by
$n^2$, the total number of terms (real and "phantom") involved. Viewed in this
light it would seem that division by $n-1$ is more reasonable, in that the inclusion
of identically zero terms in the formulation of a measure of variation is a bit
unreasonable.

Ordinarily, the magnitude of $\sigma^2$ (and, hence, of $\sigma$),
unlike that of $\mu$, depends only on the measurement
process concerned and the circumstances under
which it is applied, and not also on the magnitude
of the quantity measured; otherwise we could not
speak of a measurement process having a variance,
or a standard deviation.

Since the precision of the process obviously
decreases as the value of $\sigma$ (or, of $\sigma^2$) increases, and
vice versa, it is necessary to take some inverse function
of $\sigma$ as a measure of the precision of the process.
To conform with traditional usage it is necessary
to regard the precision of a measurement process as
inversely proportional to its standard deviation $\sigma$
which is, therefore, a measure of the imprecision of
the process. Thus, Gauss, writing in 1809, remarked
that his constant $h = 1/(\sigma\sqrt{2})$ could properly be
considered to be a measure of the precision of the
observations because if, for example, $h' = 2h$, that is,
if $\sigma' = \tfrac{1}{2}\sigma$, then "a double error can be committed
in the former system with the same facility as a
single error in the latter, in which case, according
to the common way of speaking, a double degree of
precision is attributed to the latter observations."¹¹
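
Gauss's remark can be illustrated under the assumption, made here for the sketch and implicit in his setting, that the errors follow a normal law centered on zero. In that case the probability that an error lies within plus or minus a of zero is erf(h a), with h = 1/(sigma sqrt(2)). The function names and the numerical values below are illustrative only.

```python
import math


def precision_constant(sigma):
    """Gauss's constant h = 1 / (sigma * sqrt(2))."""
    return 1.0 / (sigma * math.sqrt(2.0))


def prob_error_within(limit, sigma):
    """P(|error| < limit) for errors distributed N(0, sigma^2)."""
    return math.erf(precision_constant(sigma) * limit)


# Two hypothetical systems of observations: the latter has h' = 2h.
sigma_former = 2.0
sigma_latter = 1.0
a = 0.8  # an arbitrary error limit, in the same units as sigma

p_double_in_former = prob_error_within(2.0 * a, sigma_former)
p_single_in_latter = prob_error_within(a, sigma_latter)

print(precision_constant(sigma_latter) / precision_constant(sigma_former))
print(p_double_in_former, p_single_in_latter)
```

The two probabilities agree, which is the sense in which an error of 2a in the former system is as "easy" to commit as an error of a in the latter.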

The fact of the matter is, however, that: 

". . . different fields have particularly favorite ways 
of expressing precision. Most of these measures are multiples
of the standard deviation; it is not always clear which
multiple is meant. . . .

"Some consider it unfortunate that precision should be
stated as a multiple of standard deviation, since precision 
should increase as standard deviation decreases. Indeed, 
it would be more exact to say that standard deviation is a 
measure of imprecision. However, sensitivity, as we have 
previously indicated, suffers from this logical inversion 
without hurt. Perhaps we can best avoid this by saying 
that standard deviation is an index of precision. The habit 
of saying 'The precision is ... ' is deeply rooted, and 
there would be understandable impatience with the notion 
that standard deviation should be numerically inverted 
before being quoted in a statement of precision." [Murphy 
1961, pp. 266-267.] 

In consequence the ASTM has, at least tentatively,
taken the following position:

"The numerical value of any commonly used index of
precision will be smaller the more closely bunched are the
individual measurements of a process. As more causes are
added to the system, the greater the numerical value of
the index of precision will ordinarily become. If the same
index of precision is used on two different processes based
on the same method or intended to measure the same physical
property, the process that has the smaller value of the index
of precision is said to have higher precision. Thus, although
the more usual indexes of precision are really direct measures
of imprecision, this inversion of reference has been firmly
established by custom. The value of the selected index of
precision of a process is referred to simply as its precision or
its stated precision." [ASTM 1961, p. 1759.]

11 "Ceterum constans h tamquam mensura praecisionis observationum
considerari poterit. . . . Quodsi igitur e.g., h'=2h, aeque facile in
systemate priori error duplex committi poterit, ac simplex in
posteriori, in quo casu observationibus posterioribus secundum
vulgarem loquendi morem praecisio duplex tribuitur."
[Gauss 1809, Art. 178; 1871, p. 233; English translation, 1857, pp. 259-260.]

As we have remarked previously, in practical work
the end result of measuring some quantity or calibrating
an instrument or a standard rarely consists
of a single measurement of the quantity of interest.
More often it is some kind of average or adjusted
value, for example, the arithmetic mean of a number
of independent measurements of the quantity of
interest. Let us, therefore, consider the statistical
properties of a sequence of arithmetic means of
successive nonoverlapping groups of n measurements
each from a sequence (1) of individual measurements
yielded by a measurement process on a particular
occasion. In other words, let us consider the
sequence



$$\bar{x}_{i1},\ \bar{x}_{i2},\ \ldots,\ \bar{x}_{im},\ \ldots \tag{11}$$

of distinct arithmetic means of n measurements each

$$\bar{x}_{im} = \frac{1}{n}\sum_{j=(m-1)n+1}^{mn} x_{ij}, \qquad (m = 1, 2, \ldots), \tag{12}$$



derived from a sequence (1) of individual measurements
of a single quantity produced, or at least
conceptually producible, by the measurement process
concerned on, say, the ith occasion. If the "underlying
measurement process" giving rise to the individual
measurements $x_{ij}$ is in a state of simple
statistical control, then the "extended measurement
process" giving rise to the averages $\bar{x}_{im}$ will also be
in a state of simple statistical control. Consequently,
the mathematical analysis of section 3.2,
but with the averages $\bar{x}_{im}$ in place of the individual
measurements $x_{ij}$, will carry through without other
change. Let $\mu_{\bar{x}}$ denote the limiting mean thus
associated with the "extended measurement process"
giving rise to the averages $\bar{x}_{im}$ as its "individual"
measurements. Since the cumulative arithmetic
mean of the first m terms of the sequence (11) is
the same as the cumulative arithmetic mean of the
first mn terms of the sequence (1) of individual
measurements, it is clear that the limiting mean
$\mu_{\bar{x}}$ associated with the sequence of averages (11) is
the same as the limiting mean associated with the
original sequence (1) of individual measurements,
that is,

$$\mu_{\bar{x}} = \mu_x = \mu. \tag{13}$$
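
The step from cumulative means of averages to cumulative means of individual measurements can be written out explicitly. The following is only a sketch in the notation of (11) and (12); the summation index r is introduced here for illustration.

```latex
\frac{1}{m}\sum_{r=1}^{m}\bar{x}_{ir}
  \;=\; \frac{1}{m}\sum_{r=1}^{m}\,\frac{1}{n}\sum_{j=(r-1)n+1}^{rn} x_{ij}
  \;=\; \frac{1}{mn}\sum_{j=1}^{mn} x_{ij}
```

As m tends to infinity the left-hand side tends to the limiting mean of the averages and the right-hand side to the limiting mean of the individual measurements, so the two limits must coincide.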

Similarly, the mathematical analysis at the
beginning of the present section, but with the individual
measurements $x_{ij}$ in (3) thru (9) replaced
by the averages $\bar{x}_{im}$, carries through essentially as
before. Let $\sigma_{\bar{x}}^2$ denote the variance thus associated
with the "extended measurement process" giving
rise to the sequence of averages (11). As in the
case of the variance $\sigma^2$ of individual measurements,
so also may $\sigma_{\bar{x}}^2$ be interpreted as the overall mean
value of the squared deviation of "individual"
averages $\bar{x}$ from the limiting mean $\mu$ of the "extended
process," that is,

$$\sigma_{\bar{x}}^2 = \overline{(\bar{x} - \mu)^2}. \tag{14}$$

By virtue of the algebraic identity

$$\begin{aligned}
(\bar{x} - \mu)^2 &= \left[\frac{1}{n}\sum_{i=1}^{n}(x_i - \mu)\right]^2 \\
&= \frac{1}{n^2}\left[\sum_{i=1}^{n}(x_i - \mu)^2 + 2\sum_{i=1}^{n-1}\sum_{j=i+1}^{n}(x_i - \mu)(x_j - \mu)\right]
\end{aligned} \tag{15}$$

it is readily seen that

$$\sigma_{\bar{x}}^2 = \frac{n\sigma^2}{n^2} = \frac{\sigma^2}{n}. \tag{16}$$



(The mean value of a sum is always the sum of the
mean values of its individual terms, so that the
overall mean value of the first summation inside the
brackets in the last line of (15) is simply $n\sigma^2$.
Furthermore, in the case of independent identically
distributed measurements, the overall mean value
of the term involving the double summation is 0.)

Since, from (16), $\sigma_{\bar{x}} = \sigma/\sqrt{n}$, it is seen that the
precision of the arithmetic mean of n independent
measurements is proportional to $\sqrt{n}$. Hence the
arithmetic mean of 4 independent measurements
has double the precision of a single measurement;
the mean of 9 independent measurements, thrice the
precision of a single measurement; and 144 independent
measurements will be required if their
arithmetic mean is to have a 12-fold increase in
precision over a single measurement. (But to ask
for a 12-fold increase in precision is to ask for a very
considerable improvement indeed, as can be seen
from a comparison of curves a and e in the top half
of fig. 1.)
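
The square-root relation in (16) is easy to tabulate. The following minimal sketch assumes independent measurements from a process in simple statistical control; the function names are chosen here for illustration only.

```python
import math


def standard_error(sigma, n):
    """Standard deviation of the mean of n independent measurements,
    each with standard deviation sigma (equation 16)."""
    return sigma / math.sqrt(n)


def measurements_needed(gain):
    """Smallest number of independent measurements whose arithmetic mean
    is `gain` times as precise as a single measurement."""
    return math.ceil(gain ** 2)


for gain in (2, 3, 12):
    print(gain, measurements_needed(gain))
```

The quadratic growth of the required number of measurements is the reason a large gain in precision is rarely obtained by repetition alone.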

To serve as a reminder of the distinction between
the standard deviation of an individual measurement x
and the standard deviation of a mean $\bar{x}$, it is customary
to refer to $\sigma$ as the "standard deviation" of
a single measurement x, and to $\sigma_{\bar{x}}$ as the "standard
error" of the (arithmetic) mean $\bar{x}$.

b. Within-Occasions Control

In the foregoing it has been assumed that the
individual measurements comprising the sequences
(1) corresponding to the respective "occasions,"
(i = 1, 2, . . .), could all be regarded as "observed
values" of independent identically distributed random
variables, that is, that the measurement process
concerned was in a state of simple statistical control.
When such is the case then any subset of n measurements
is strictly comparable to any other subset of
n measurements, and any two such subsets can be
combined and regarded validly as a single set of 2n
measurements. Unfortunately, as Student's comment
quoted on page 167 above clearly implies,
such complete homogeneity of measurement is rarely
if ever met in practice. More often the situation is
as described by Sir George Biddell Airy, British
Astronomer Royal 1835-1881, in (to my knowledge)
the first elementary book on the theory of errors and
combination of observations in the English language
[Airy 1861, p. 92]:

"When successive series of observations are made, day
after day, of the same measurable quantity, which is either 
invariable ... or admits of being reduced by calculation to 
an invariable quantity . . .; and when every known instru- 
mental correction has been applied . . .; still it will sometimes 
be found that the result obtained on one day differs from the 
result obtained on another day by a larger quantity than 
could have been anticipated. The idea then presents itself, 
that possibly there has been on some one day, or on every 
day, some cause, special to the day, which has produced a 
Constant Error in the measures of that day." 

Sir George, however, cautions against jumping to
conclusions on the basis of only a few observations:

"The existence of a daily constant error . . . ought not
to be lightly assumed. When observations are made on
only two or three days, and the number of observations on
each day is not extremely great, the mere fact, of accordance
on each day and discordance from day to day, is not sufficient
to prove a constant error. [And we should interject here
that under such circumstances apparent over-all accordance
is not sufficient to prove the absence of daily constant errors
either.] The existence of an accordance analogous to a
'round of luck' in ordinary chances is sufficiently probable.
More extensive experience, however, may give greater confidence
to the assumption of constant errors . . . first, it ought,
in general to be established that there is possibility of error,
constant on one day but varying from day to day. ..."
[Airy 1861, p. 93.]

The most useful statistical tools for this purpose
are the control-chart techniques of the industrial
quality control engineer. If in such a situation, a
series of measurements obtained by measurement of
a single quantity a number of times on each of several
different days or "occasions" by a particular
measurement process is plotted in the form of a
control chart for individuals [ASTM 1951, pp. 76-78,
and pp. 101, 105], the individual measurements so
plotted will be seen to consist of "sections" identifiable
with the subsequences (1) corresponding to the
respective "occasions," (i = 1, 2, 3, . . .), with the
measurements within sections pair-wise closer together
on the average than two measurements one
of which comes from one section and the other from
another. Such a series of measurements is clearly
"out of control." If now parallel $\bar{x}$- and R-charts
are constructed from these data, based on a series of
samples of equal size from within the respective "occasions"
or "sections" only, i.e., excluding means
$\bar{x}$ and ranges R of any samples that "straddle" two
occasions, and the points on the resulting $\bar{x}$-chart
are clearly "out of control," then we may infer the
existence of day-by-day components of error, constant,
perhaps, on one day, but varying from day
to day.

If points on the R-chart constructed as described
are "out of control" also, then the measurement
operation concerned is in a completely unstable condition
and cannot be described validly as a "measurement
process" at all. On the other hand, if the
$\bar{x}$-chart is "out of control," but the R-chart is "in
control," then we may regard the measurement
process as being in a state of within-occasions control.
("It is usually not safe to conclude that a state of
control exists unless the plotted points for at least
25 successive subgroups fall within the 3-sigma control
limits. In addition, if not more than 1 out of
35 successive points, or not more than 2 out of 100,
fall outside the 3-sigma control limits, a state of
control may ordinarily be assumed to exist." [ASA
1958c, p. 18.]) In such a situation we postulate the
existence of (at least, conceptually) different limiting
means $\mu_i$ for the respective "occasions" (i = 1, 2, . . .),
and a common within-occasions variance $\sigma_w^2$.
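
The three-way distinction drawn above (unstable, within-occasions control, simple control) can be sketched in a few lines. This is an illustrative simplification, not the procedure of the cited manuals: it treats any single point outside the 3-sigma limits as "out of control" and ignores the 25-subgroup and 1-in-35 provisions quoted above. The factors A2, D3 and D4 are the commonly tabulated values for normally distributed data and should be checked against the cited tables; the function names are hypothetical.

```python
# Commonly tabulated 3-sigma control-chart factors, keyed by subgroup size n.
# Each entry is (A2, D3, D4); values assume normally distributed measurements.
CHART_FACTORS = {
    2: (1.880, 0.0, 3.267),
    3: (1.023, 0.0, 2.574),
    4: (0.729, 0.0, 2.282),
    5: (0.577, 0.0, 2.114),
}


def out_of_control_points(subgroups):
    """Return (xbar_out, r_out): indices of subgroups whose mean or range
    falls outside the limits of the x-bar chart or the R-chart."""
    n = len(subgroups[0])
    a2, d3, d4 = CHART_FACTORS[n]
    means = [sum(g) / n for g in subgroups]
    ranges = [max(g) - min(g) for g in subgroups]
    grand_mean = sum(means) / len(means)
    mean_range = sum(ranges) / len(ranges)
    xbar_out = [i for i, m in enumerate(means)
                if abs(m - grand_mean) > a2 * mean_range]
    r_out = [i for i, r in enumerate(ranges)
             if r > d4 * mean_range or r < d3 * mean_range]
    return xbar_out, r_out


def classify(subgroups):
    """Simplified verdict: one subgroup per occasion, equal subgroup sizes."""
    xbar_out, r_out = out_of_control_points(subgroups)
    if r_out:
        return "unstable"
    if xbar_out:
        return "within-occasions control"
    return "simple statistical control"
```

Each subgroup passed to `classify` is assumed to come from a single occasion, so that no subgroup straddles two occasions.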

An unbiased estimate of the within-occasions standard
deviation $\sigma_w$ can be obtained, if desired, from the
average range $\bar{R}$ used in constructing the R-chart,
by means of the formula

$$\text{unbiased estimate of } \sigma_w = \bar{R}/d_2 \tag{17}$$

where $d_2$ is the factor given in the $d_2$ column of table
B2 of [ASTM 1951, p. 115] corresponding to the
sample or subgroup size n used in constructing the
R-chart.
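
Formula (17) amounts to dividing the average subgroup range by a tabulated constant. In the sketch below the d2 values are the commonly tabulated ones for normally distributed measurements; they are supplied here as an assumption and should be verified against the cited table B2. The function name is illustrative.

```python
# Commonly tabulated d2 factors for subgroup sizes 2..10 (normal data assumed).
D2 = {2: 1.128, 3: 1.693, 4: 2.059, 5: 2.326, 6: 2.534,
      7: 2.704, 8: 2.847, 9: 2.970, 10: 3.078}


def sigma_w_from_ranges(subgroups):
    """Estimate the within-occasions standard deviation as R-bar / d2,
    where R-bar is the average range of equal-sized subgroups."""
    n = len(subgroups[0])
    mean_range = sum(max(g) - min(g) for g in subgroups) / len(subgroups)
    return mean_range / D2[n]
```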

Alternatively, if desired, an unbiased estimate of
$\sigma_w^2$ can be obtained directly from the measurements
involved by means of the formula

$$\text{unbiased estimate of } \sigma_w^2 = s_w^2 = \frac{\sum_{h=1}^{k}\sum_{j=1}^{n}(x_{hj} - \bar{x}_h)^2}{k(n-1)} \tag{18}$$

where $x_{hj}$ denotes the jth measurement and $\bar{x}_h$ the
arithmetic mean of the n measurements of the hth
subgroup, respectively, and k is the number of subgroups
involved in constructing the R-chart.
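
The pooled estimate (18) can be computed directly from the subgroups. A minimal sketch, assuming k subgroups of equal size n with one subgroup per occasion; the function name is illustrative.

```python
def pooled_within_variance(subgroups):
    """Pooled within-subgroup variance: the sum of squared deviations of
    each measurement from its own subgroup mean, divided by k(n - 1)."""
    k = len(subgroups)
    n = len(subgroups[0])
    sum_sq = 0.0
    for group in subgroups:
        mean = sum(group) / n
        sum_sq += sum((x - mean) ** 2 for x in group)
    return sum_sq / (k * (n - 1))
```

Because deviations are taken about each subgroup's own mean, shifts between occasions do not inflate this estimate.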

c. Complex or Multistage Control 

When a measurement process is not in a state of
simple statistical control but satisfies the criteria of
within-occasions control, that is, when the $\bar{x}$-chart
(and control chart for individuals) are clearly "out
of control," but the 25 or more subgroup ranges
plotted on the R-chart exhibit control, then it is usually
of importance to ascertain whether the measurement
process concerned is possibly in a state of
complex or multistage statistical control. For this
purpose four or more measurements from each of at
least 25 different occasions will be needed. Taking
one sample of n successive measurements, ($4 \le n \le 10$),
from the available measurements corresponding
to each of, say, k ($\ge 25$) different "occasions," evaluate
the arithmetic means $\bar{x}_i$ of these samples,
(i = 1, 2, . . ., k), and treating these averages as
INDIVIDUAL measurements construct a control chart
for these "individuals" and parallel $\bar{x}$- and R-charts
as described in [ASTM 1951, Example 22, p. 101].
If the points plotted on these three control charts
exhibit control, then we "act for the present as if"
the measurement process concerned is in a state of
complex or multistage statistical control and regard the
limiting means $\mu_i$ for the respective "occasions,"
(i = 1, 2, . . .) as being in a state of simple statistical
control with a limiting mean $\mu$ and variance $\sigma_b^2$,
termed the between-occasions component of variance.

If in such a situation we were to form cumulative
arithmetic means such as (3) of the squares of all
distinct differences between arbitrary pairs of measurements
from within each of the respective "occasions,"
then such cumulative arithmetic means of
squares of differences would almost surely tend to
$2\sigma_w^2$ in the limit as the number of pairs included tends
to infinity, where $\sigma_w^2$ is the "within-occasions variance"
mentioned above in connection with "within-occasions
control." If, on the other hand we were
to form similar cumulative arithmetic means of the
squares of differences between arbitrary pairs consisting
in each instance of one measurement from
each of two different sections, then such a cumulative
arithmetic mean of squared differences would
tend almost certainly to $2(\sigma_w^2 + \sigma_b^2)$ as the number
of "occasions" sampled tends to infinity, where $\sigma_b^2$ is
the above mentioned "between-occasions variance,"
i.e., the variance of the limiting means $\mu_i$ for the
respective "occasions" about their limiting mean $\mu$.
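
The two limits can be recovered from a simple additive model. The sketch below assumes, purely for illustration, that each measurement is written as $x_{ij} = \mu_i + e_{ij}$, where the $e_{ij}$ are independent with mean zero and variance $\sigma_w^2$, and the $\mu_i$ are independent of the $e_{ij}$ with mean $\mu$ and variance $\sigma_b^2$. The symbol $e_{ij}$ is introduced here and does not appear in the text.

```latex
% Two measurements from the same occasion i (j \neq j'):
\overline{(x_{ij} - x_{ij'})^2}
  = \overline{(e_{ij} - e_{ij'})^2}
  = \sigma_w^2 + \sigma_w^2 = 2\sigma_w^2

% Two measurements from different occasions i \neq i':
\overline{(x_{ij} - x_{i'j'})^2}
  = \overline{(\mu_i - \mu_{i'})^2} + \overline{(e_{ij} - e_{i'j'})^2}
  = 2\sigma_b^2 + 2\sigma_w^2 = 2(\sigma_w^2 + \sigma_b^2)
```

The cross term vanishes on the average in the second line because the occasion means and the within-occasion deviations are assumed independent.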

If in utilizing measurements from a measurement
process that is in such a state of complex statistical
control, one forms an average $\bar{x}_N$ that is the arithmetic
mean of a total of N = kn measurements, composed
of n measurements from each of k different
"occasions," then the variance of $\bar{x}_N$ will be

$$\sigma_{\bar{x}_N}^2 = \overline{(\bar{x}_N - \mu)^2} = \frac{1}{k}\left(\sigma_b^2 + \frac{\sigma_w^2}{n}\right). \tag{19}$$

From (19) it is clear that, if $\sigma_b^2$ is at all sizable compared
to $\sigma_w^2$, then, for fixed N = kn, $\bar{x}_N$ will have
greater precision as a determination of $\mu$ when based
on a large number k of different occasions, with only
a small number n of measurements from each occasion.
Finally, setting k = 1, we see that the mean
$\bar{x}_i$ of n measurements all taken on the same occasion
considered as a determination of the overall limiting
mean $\mu$ has an overall variance $\sigma_{\bar{x}}^2 = \sigma_b^2 + (\sigma_w^2/n)$; but
considered as a determination of $\mu_i$, the limiting mean
for the ith occasion, its variance is only $\sigma_w^2/n$. In
other words, the "standard error" of a mean such
as $\bar{x}_i$ is not unique, but depends on the purpose for
which it is to be used.
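
The consequence of (19) for the allocation of a fixed number of measurements can be illustrated numerically. The variance components used below are hypothetical values chosen only for the example.

```python
def variance_of_mean(sigma_b2, sigma_w2, k, n):
    """Variance of the mean of N = k*n measurements, n from each of k
    occasions, under complex (two-stage) statistical control (equation 19)."""
    return (sigma_b2 + sigma_w2 / n) / k


# Hypothetical components: between-occasions 1.0, within-occasions 4.0.
# Three ways of spending N = 12 measurements:
for k, n in ((1, 12), (3, 4), (12, 1)):
    print(k, n, variance_of_mean(1.0, 4.0, k, n))
```

With these assumed components the variance falls steadily as the same twelve measurements are spread over more occasions, which is the point made in the text.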

An unbiased estimate of the overall standard
deviation $\sigma_{\bar{x}}$ of the arithmetic mean of n measurements
taken on a single "occasion" may be obtained
by the procedure of formula (17) above, if
desired, using the average range $\bar{R}$ employed in constructing
the R-chart corresponding to the groups of
averages $\bar{x}_i$.

Alternatively, an unbiased estimate of the overall
variance $\sigma_{\bar{x}}^2$ can be obtained directly from the means
$\bar{x}_i$ used in constructing the $\bar{x}$-chart, by using the
formula

$$s_{\bar{x}}^2 = \frac{\sum_{i=1}^{k}(\bar{x}_i - \bar{\bar{x}})^2}{k-1} \tag{20}$$

where $\bar{x}_i$ is the arithmetic mean of the n successive
observations from the ith "occasion," (i = 1, 2, . . ., k)
and $\bar{\bar{x}}$ is the arithmetic mean of these k means.
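
Formula (20) is the ordinary sample variance applied to the occasion means. A minimal sketch, assuming one mean per occasion, each based on the same number n of measurements; the function name is illustrative.

```python
def overall_variance_of_occasion_mean(occasion_means):
    """Sample variance of the k occasion means about their grand mean,
    with divisor k - 1 (equation 20)."""
    k = len(occasion_means)
    grand_mean = sum(occasion_means) / k
    return sum((m - grand_mean) ** 2 for m in occasion_means) / (k - 1)
```

The result estimates the overall variance of a single-occasion mean, which includes both the between-occasions component and the within-occasions component divided by n.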

The foregoing concept of a state of complex or
multistage statistical control can be extended readily
to more complex truly "multistage" situations involving
three or more "levels" of random variation.

Finally, it is evident from the foregoing that when
a measurement process is in a state of complex or
multistage statistical control, then the difference between
two individual measurements (or the arithmetic
means of n measurements) corresponding to
two different "occasions" will include the difference
$\mu_i - \mu_{i'}$ between the limiting means corresponding to
the two particular occasions involved. In so far as
such a comparison is regarded as a unique individual
case, the difference $\mu_i - \mu_{i'}$ is a fixed constant and
hence a systematic error affecting this comparison.
On the other hand, if the difference between these
two individual measurements (or these two arithmetic
means) is regarded only as a typical instance
of the outcomes that might be yielded by the same
measurement process on other pairs of occasions, then
the difference $\mu_i - \mu_{i'}$ may be regarded as a random
component having a zero mean and variance $2\sigma_b^2$.

It goes without saying, of course, that if a control-chart
analysis of the type described above is undertaken
for the purpose of ascertaining whether the
process is in a state of complex control, but the points
plotted on the $\bar{x}$-chart are clearly "out of control,"
then the measurement process concerned cannot be
regarded as statistically stable from occasion to occasion,
and should be used only for comparative measurement
within-occasions. Even when such a measurement
process is used solely for comparative measurement
within "occasions," it needs to be shown
that comparative measurements or fixed differences
are in a state of (simple or complex) statistical control,
if this measurement process is to be generally
valid in any absolute sense. Thus in the case of the
thermometer calibration procedure mentioned in section
2.4 above, one needs to examine the results of
repeated measurement, occasion after occasion, of
the difference between two standard thermometers
$S_1$ and $S_2$ of proven stability in order to determine
whether the process is or is not in a state of simple
or complex statistical control.



3.5. Difficulty of Characterizing the Accuracy of a 
Measurement Process 

Unfortunately, there does not exist any single comprehensive
measure of the accuracy (or inaccuracy)
of a measurement process (analogous to the standard
deviation as a measure of its imprecision) that is
really satisfactory. This difficulty stems from the
fact that "accuracy," like "true value," seems to be
a reasonably definite concept on first thought, but
as soon as one attempts to specify exactly what one
means by "accuracy" in a particular situation, the
concept becomes elusive; and in attempting to resolve
the matter one comes face to face, sooner or
later, with the question: "Accurate" for what
purpose?

Gauss, in his second development (1821-1823) of
the Method of Least Squares clearly recognized the
difficulty of characterizing sharply the "accuracy"
of any particular procedure:

"Quippe quaestio haec per rei naturam aliquid vagi
implicat, quod limitibus circumscribi nisi per principium
aliquatenus arbitrarium nequit . . . neque demonstrationibus
mathematicis decidenda, sed libero tantum arbitrio
remittenda." 12 [Gauss 1823, Part I, Art. 6.]

12 I am grateful to my colleague Franz Alt for the following literal
translation of these phrases:

"For this question implies, by the very nature of the matter, something
vague which cannot be clearly delimited except by a somewhat arbitrary
principle . . . nor can it be decided by mathematical demonstrations,
but must be left to mere arbitrary judgment."

Gauss himself proposed [loc. cit.] that the mean
square error of a procedure (that is, $\sigma^2 + (\mu - \tau)^2$,
where $\sigma$ is its standard deviation, and $\mu - \tau$, its bias) be
used to characterize its accuracy. While mean square
error is a useful criterion for comparing the relative
accuracies of measurement processes differing widely
in both precision and bias, it clearly does not "tell
the whole story." For example, if one were to
adopt the principle that measurement processes
having the same mean square error were equally
"accurate," then one would be obliged to consider
the measurement processes corresponding to the
three curves shown in figure 3 as being of equal
accuracy, whereas for many purposes one would
regard process C (portrayed to the right) as the
"most accurate," in spite of the fact that the chances
of scoring a "bull's eye" or "near miss" are greater
in the case of process A shown in the upper left.
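
A small numerical illustration may make the point concrete. The three processes below are hypothetical and are not the curves of figure 3; they are chosen so that all have the same mean square error, and normally distributed errors are assumed throughout.

```python
import math


def normal_cdf(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def mean_square_error(sigma, bias):
    """Gauss's criterion: variance plus squared bias."""
    return sigma ** 2 + bias ** 2


def prob_within(sigma, bias, delta):
    """Chance that a normally distributed measurement with the given
    standard deviation and bias lands within delta of the true value."""
    return normal_cdf((delta - bias) / sigma) - normal_cdf((-delta - bias) / sigma)


# Hypothetical (sigma, bias) pairs, all with mean square error 1.
processes = {"A": (1.0, 0.0), "B": (0.8, 0.6), "C": (0.6, 0.8)}
for name, (sigma, bias) in processes.items():
    print(name, mean_square_error(sigma, bias), prob_within(sigma, bias, 0.25))
```

Under these assumptions the unbiased but less precise process has the best chance of a near miss, even though the mean square error criterion rates all three alike.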

Alternatively, if one were to say that two measure- 
ment processes were equally accurate when exactly 
the same proportion P of the measurements of each 
lay within ±5 units from the true value, then for 
P=0.5 one would be obliged to say that the measure- 
ment processes corresponding to curves e and d 
in the lower half of figure 1 were equally accurate, 
and that the measurement process corresponding to 
curve a in the upper half of the same figure was 
slightly more accurate than either e or d. Or, 
taking P=0.95, one would be obliged to say that 
the measurement processes corresponding to the 
three curves shown in figure 4 were equally accurate. 
From these, and other cases easily constructed, it is 
readily seen that it is unsatisfactory to regard two 
measurement processes as being equally accurate if 
the same specified fraction P of the measurements 
produced by each lie within the same distance from 
the true value. 
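
The weakness of the fixed-proportion criterion can also be shown numerically. In the following sketch two hypothetical normal processes, one unbiased and one strongly biased but more precise, are found to have nearly the same half-width for P = 0.95. The parameter values and function names are assumptions made for the example, and the half-width is found by simple bisection.

```python
import math


def normal_cdf(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def proportion_within(sigma, bias, delta):
    """Proportion of measurements within delta of the true value."""
    return normal_cdf((delta - bias) / sigma) - normal_cdf((-delta - bias) / sigma)


def half_width(sigma, bias, proportion):
    """Distance from the true value that contains the stated proportion
    of measurements, located by bisection on a bracketing interval."""
    lo, hi = 0.0, abs(bias) + 10.0 * sigma
    for _ in range(100):
        mid = 0.5 * (lo + hi)
        if proportion_within(sigma, bias, mid) < proportion:
            lo = mid
        else:
            hi = mid
    return hi


print(half_width(1.0, 0.0, 0.95))      # unbiased, sigma = 1
print(half_width(0.5, 1.1375, 0.95))   # biased, sigma = 0.5
```

Two processes that this criterion would call equally accurate can thus differ markedly in both precision and bias.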

Thus one is led by the force of necessity to the
inescapable conclusion that ordinarily (at least)
two numbers are needed to adequately characterize
the accuracy of a measurement process. And this
has been recognized by the American Society for
Testing and Materials in their recent recommendations
[ASTM 1961, pp. 1759-1760]:

"Generally the index of accuracy will consist of two or
more different numbers. Since the concept of accuracy
embraces not only the concept of precision but also the idea
of more or less consistent deviation from the reference level
(systematic error or bias), it is preferable to describe accuracy
by separate values indicating precision and bias."

The fact of the matter is that two numbers ordinarily
suffice only because the "end results" of measurement
and calibration programs are usually averages or
adjusted values based on a number of independent
"primary measurements," and such averages and
adjusted values tend to be normally distributed to
a very good approximation when four or more "primary
measurements" are involved. This is illustrated
by figure 5, which shows the distributions of
individual measurements of two unbiased measurement
processes with identical standard deviations
but having uniform and normal "laws of error,"
respectively, together with the corresponding distributions
of arithmetic means of 4 independent
measurements from these respective processes;
these latter two distributions are depicted by a single
curve because the differences between the two
distributions concerned are far less than can be
resolved on a chart drawn to this scale. Since both
of the processes concerned are unbiased, "accuracy"
thus becomes only a matter of "precision." Or does
it? Both curves for n = 1 have the same standard
deviation; do they reflect equal "accuracy"? Would
not the answer depend on the advantages to be
gained from small errors balanced against the seriousness
of large errors, in relation to the purpose for
which a single measurement from one or the other
is needed? But "the problem" disappears nicely
if averages of 4 measurements are to be used.
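
How closely the mean of four uniformly distributed measurements follows the normal law can be checked without simulation. The sketch below uses the exact distribution function of a sum of n independent uniform (0, 1) variables (the Irwin-Hall distribution) and compares it, on a standardized scale, with the normal distribution function. The choice of the unit interval and of the grid are assumptions of the example; standardizing makes the comparison independent of them.

```python
import math


def normal_cdf(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def uniform_mean_cdf(z, n):
    """Distribution function, at standardized value z, of the mean of n
    independent uniform(0, 1) measurements (exact Irwin-Hall formula)."""
    x = n / 2.0 + z * math.sqrt(n / 12.0)  # corresponding value of the sum
    if x <= 0.0:
        return 0.0
    if x >= n:
        return 1.0
    total = sum((-1) ** k * math.comb(n, k) * (x - k) ** n
                for k in range(int(math.floor(x)) + 1))
    return total / math.factorial(n)


def max_cdf_gap(n, steps=2000):
    """Largest absolute difference between the two distribution functions
    over a grid of standardized values from -4 to 4."""
    grid = [-4.0 + 8.0 * i / steps for i in range(steps + 1)]
    return max(abs(uniform_mean_cdf(z, n) - normal_cdf(z)) for z in grid)


print(max_cdf_gap(1), max_cdf_gap(4))
```

The gap for n = 4 is several times smaller than for a single measurement, consistent with the two n = 4 curves being indistinguishable at the scale of the figure.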
4. Evaluation of the Precision, and of Credible
Bounds to the Systematic Error of a
Measurement Process

As we have just seen, two numbers are ordinarily
needed to characterize the accuracy of a measurement
process, the one indicating its precision, and
the other its bias. In practice, however, the bias of
a measurement process is unknown and unknowable
because the "true values" of quantities measured are
almost always unknown and unknowable. The
principal exception is when one is measuring a
difference that is by hypothesis identically zero.
If the bias of a measurement process could be, and
were known exactly, then one would of course
subtract it off as a "correction" and thus dispose of
it entirely. Since ordinarily we cannot expect to
know the exact magnitude of the bias of a measurement
process, we are forced in practice to settle
for credible bounds to its likely magnitude, much
as did Steyning and the thief in chapter VI of Kipling's
story, Captains Courageous: "Steyning tuk him for
the reason that the thief tuk the hot stove, bekaze
for there was nothing else that season". Consequently,
neither the bias nor the accuracy of any
measurement process, or method of measurement,
can ever be known in a logical sense. The precision
of a measurement process, however, can be measured
and known. (Compare Deming [1950, p. 17].)

4.1. Evaluation of the Precision of a Measurement
Process

In the foregoing we have stressed that a measurement
operation to qualify as a measurement process
must have attained a state of statistical control; and
that until a measurement operation has been
"debugged" to the extent that it has attained a
state of statistical control, it cannot be regarded in
any logical sense as measuring anything at all. It
is also clear, from our discussion of the control-chart
techniques for determining whether in any given
instance one is entitled to "act for the present as if"
a state of statistical control has been attained, that
a fairly large amount of experience with a particular
measurement process is needed before one can
resolve the question in the affirmative. Once a
measurement process has attained a state of statistical
control, and so long as it remains in this
state, then an estimate of the standard deviation of
the process can be obtained from the data employed
in establishing control, as we have indicated above.
Since the precision of a measurement process
refers to, and is determined by the characteristic
"closeness together" of successive independent measurements
of a single magnitude generated by repeated
application of the process under specified conditions,
it is clearly necessary in determining whether a
measurement operation is or is not in a state of
statistical control, and in evaluating its precision to
be reasonably definite on what variations of procedure,
apparatus, environmental conditions, observers,
operators, etc., are allowable in "repeated applications"
of what will be considered to be the same
measurement process applied to the measurement of
the same quantity under the same conditions. If
whatever measure of the precision and bounds to
the bias of the measurement process we may adopt
are to provide a realistic indication of the accuracy
of this process in practice, then the "allowable variations"
must be of sufficient scope to bracket the
range of circumstances commonly met in practice.
Scientists and engineers commonly append "probable
errors" or "standard errors" to the results of their
experiments and tests. These measures of imprecision
are supposed to indicate the extent of the
reproducibility of these experiments or tests under
"essentially the same conditions," but there are
great doubts whether the "probable errors" and
"standard errors" generally presented actually have
this meaning. The fault in most cases is not with
the statistical formulas and procedures used to compute
such probable errors or standard errors from
the measurements in hand, but rather with the
limited scope of the "conditions" sampled in taking
the measurements.

a. Concept of a "Repetition" of a Measurement

As a very minimum, a "repetition" of a measurement
by the same measurement process should "leave
the door open" to, and in no way inhibit changes of
the sort that would occur if, on termination of a
given series of measurements, the data sheets were
stolen and the experimenter were to repeat the
series as closely as possible with the same apparatus
and auxiliary equipment following the same instructions.
In contrast, a "repetition" by the same
method of measurement should permit and in no way
inhibit the natural occurrence of such changes as
will occur if the experimenter were to mail to a
friend complete details of the apparatus, auxiliary
equipment, and experimental procedure employed
(i.e., the written text specification that defines the
"method of measurement" concerned) and the
friend, using apparatus and auxiliary equipment of
the same kind, and following the procedural instructions
received to the best of his ability, were then,
after a little practice, to attempt a repetition of the
measurement of the same quantity. Such are the
extremes, but there is a "gray region" between in
which there is not to be found a sharp line of demarcation
between the "areas" corresponding to
"repetition" by the same measurement process, and
to "repetition" by the same method of measurement.

Let us consider "repetitions" by the same meas- 
urement process more fully. Such repetitions will 
undoubtedly be carried out in the same place, i.e., 
in the same laboratory, because if it is to be the 
same measurement process, the very same apparatus 
must be used. But a "repetition" cannot be carried 
out at the same time. How great a lapse of time 
should be allowed, nay required, between "repeti- 
tions"? This is a crucial question. Student 
gives an answer in a passage from which we quoted 
above [Student 1917, p. 415]: 
"Perhaps I may be permitted to restate my opinion as to
the best way of judging the accuracy of physical or chemical
determinations.

"After considerable experience I have not encountered
any determination which is not influenced by the date on 
which it is made; from this it follows that a number of 
determinations of the same thing made on the same day are 
likely to lie more closely together than if the repetitions had 
been made on different days. 

"It also follows that if the probable error is calculated 
from a number of observations made close together in point 
of time, much of the secular error will be left out and for 
general use the probable error will be too small. 

"Where then the materials are sufficiently stable it is 
well to run a number of determinations on the same material 
through any series of routine determinations which have to be 
made, spreading them over the whole period." 

Another important question is: Are "repetitions"
by the same measurement process, to be limited to
repetitions by the same observers and operators,
using the same auxiliary equipment (bottles of
reagents, etc.); or enlarged to include repetitions
with nominally equivalent auxiliary equipment, by
various but equivalently trained observers and
operators? I believe that everyone will agree that
substitution, and certainly replacement, of bottles
of reagents, of batteries as sources of electrical
energy, etc., by "nominally equivalent materials"
must be allowed. And any calibration laboratory
having a large amount of "business" will certainly,
in the long run at any rate, have to face up to allowing
changes, even replacement of observers and
operators, and, ultimately, even of apparatus.

A very crucial question, not always faced squarely,
is: in complete "repetitions" by the same measurement
process, are such "repetitions" to be limited to
those intervals of time over which the apparatus is
used "as is" and "undisturbed," or extended to
include the additional variations that almost always
manifest themselves when the apparatus is disassembled,
cleaned, reassembled, and readjusted?
Unless such disassembly, cleaning, reassembly, and
readjustment of apparatus is permitted among the
allowable variations affecting a "repetition" by the
same measurement process, then there is very little
hope of achieving satisfactory agreement between
two or more measurement processes in the same
laboratory that differ only in their identification with
different pieces of apparatus of the same kind. In
practice it is found that statistical control can be
attained and maintained under such a broad concept
of "repetition" only through the use of reference
standards of proven stability. Furthermore, by
thus more squarely facing the issue of the scope of
variations allowable with respect to "repetitions"
by the same measurement process, we shall go a
long way toward narrowing the gap between a
"repetition" by the same measurement process and
by the same method of measurement.

As we have said before, if whatever measures of 
the precision and bias of a measurement process we 
may adopt are to provide a realistic indication of the 
accuracy of this process in practice, then the "allow- 
able variations" must be of sufficient scope to bracket 
the range of circumstances commonly met in practice.
Furthermore, any experimental program that
aims to determine the precision and systematic error,
and thence the accuracy of a measurement process,
must be based on an appropriate random sampling
of this "range of circumstances," if the usual tools
of statistical analysis are to be strictly applicable.
Or as Student put it, "the experiments must be
capable of being considered to be a random sample
of the population to which the conclusions are to be
applied. Neglect of this rule has led to the estimate
of the value of statistics which is expressed in the
crescendo 'lies, damned lies, statistics.'" [Student
1926, p. 711.]

When adequate random sampling of the appropriate
"range of circumstances" is not feasible, or
even possible, then it is necessary to compute, by 
extrapolation from available data, a more or less 
subjective estimate of the "precision" of the end 
results of a measurement operation, to serve as a 
substitute for a direct experimental measure of their 
"reproducibility." Youden [1962d] calls this "ap- 
proach the 'paper way' of obtaining an estimate of 
the [precision]." Its validity, if any, "is based on 
subject-matter knowledge and skill, general informa- 
tion, and intuition — but not on statistical method- 
ology" [Cochran et al. 1953, p. 693]. 

b. Some Examples of Realistic "Repetitions" 

As Student remarked [1917, p. 415], "The best way 
of judging the accuracy of physical or chemical 
determination . . . [when] the materials are suffi- 
ciently stable ... is ... to run a number of 
determinations on the same material thru any series 
of routine determinations which have to be made, 
spreading them over the whole period." To this 
end, as well as to provide an overall check on pro- 
cedure, on the stability of reference standards, and 
to guard against mistakes, it is common practice in 
many calibration procedures, to utilize two or more 
reference standards as part of the regular calibration 
procedure. 

The calibration procedure for liquid-in-glass therm- 
ometers, referred to in section 2.4 above, is a case in 
point. A measurement of the difference between the
two standards S1 and S2 is obtained as a by-product
of the calibration of the four test thermometers
T1, T2, T3, and T4 in terms of the (corrected) readings
of the two standards. It is such remeasurements of
the difference between a pair of standard thermometers
from "occasion" to "occasion" that constitute
realistic "repetitions" of the calibration procedure.
The data yielded by these "repetitions" are of 
exactly the type needed (a) to ascertain whether or 
not the process is in a state of statistical control; and 
if so, (b) to determine its overall standard deviation. 
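
A minimal sketch of how such data might be used is given below, in Python. It is an illustration only, not the procedure of any laboratory named here: limits are set from a baseline of past differences between the two standards, and each later difference is compared with them. The function names, the 3-sigma convention, and all the numerical values are hypothetical.

```python
import statistics

def control_limits(baseline, k=3.0):
    """Mean, standard deviation, and k-sigma limits from baseline differences."""
    center = statistics.mean(baseline)
    sd = statistics.stdev(baseline)  # sample standard deviation (n - 1 divisor)
    return center, sd, (center - k * sd, center + k * sd)

def within_limits(value, limits):
    """True if a newly observed difference lies inside the limits."""
    lower, upper = limits
    return lower <= value <= upper

# Hypothetical differences S1 - S2 (degrees C), one per past calibration occasion.
baseline = [0.012, 0.009, 0.011, 0.014, 0.010, 0.008, 0.013, 0.011, 0.010, 0.012]
center, sd, limits = control_limits(baseline)

# Hypothetical differences observed on two later occasions.
for new_difference in (0.013, 0.030):
    status = "within limits" if within_limits(new_difference, limits) else "outside limits"
    print(new_difference, status)
```

The limits are computed from earlier occasions rather than from the values being judged, since a small set of values can never fall outside 3-sigma limits computed from that same set. A value outside the limits is a signal to look for a cause, not proof that the process is out of control; the control chart references in the bibliography describe more refined criteria.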

Similarly, in the calibration of laboratory standards
of mass at the National Bureau of Standards,
"known standard weights are calibrated side-by-side
with [the] unknown weights" [Almer et al., 1962,
p. 33]. Indeed, weights whose values are otherwise
determined "are not said to have been 'calibrated'.
That term is reserved for measurements based on at
least two mass standards." [loc. cit., p. 43.] In the
specimen work sheets exhibited by Almer et al., the
auxiliary standards involved are those from the
Bureau's "NH series" of reference standards known
by the designations NH50, NH20, and NH10,
respectively. It is the measurements obtained in
routine calibrations of the differences between the
values of these standards and their accepted values
that not only provide valuable checks on day-to-day
procedure, but also serve as the basis for determination
of the overall standard deviation of this calibration
process.

A third example is provided by the method
followed at the National Bureau of Standards for
testing alternating-current watthour meters, which has
been described in some detail by Spinks and Zapf
[1954]. Four reference watthour meters are involved.
One of these, termed "the Standard Watthour
Meter," is located in the device portrayed in figure
1 of the paper by Spinks and Zapf. The other three
are located in a temperature-controlled cabinet.
A "test" of a watthour meter sent to the Bureau
involves not only a comparison of this watthour
meter with the Standard Watthour Meter, but also
comparisons of each of the Comparison Standard
Watthour Meters with the Standard Watthour
Meter. It is from the data yielded by these
intercomparisons of the Standard Watthour Meter and
the Comparison Standard Watthour Meters that
the standard deviation of this test procedure is
evaluated. Spinks and Zapf's section on "Precision
and Accuracy Attainable" is notable for its
exceptional lucidity as well as for its completeness
with respect to relevant details.

Some additional examples of realistic "repetitions"
are discussed by Youden [1962c].

4.2. Treatment of Inaccuracy Due to Systematic Errors
of Assignable Origins but of Unknown Magnitudes

As we remarked in section 3.3b above, the systematic
error of a measurement process will ordinarily
have both constant and variable components. For
convenience of exposition, it is customary to regard
the individual components of the overall systematic
error of a measurement or calibration process as
elemental or constituent "systematic errors" and to
refer to them simply as "systematic errors," for
short. Included among such "systematic errors"
affecting a particular measurement or calibration
process are: ". . . all those errors which cannot be
regarded as fortuitous, as partaking of the nature
of chance. They are characteristic of the system
involved in the work; they may arise from errors in
theory or in standards, from imperfections in the
apparatus or in the observer, from false assumptions,
etc. To them, the statistical theory of error does not
apply." [Dorsey 1944, p. 6; Dorsey and Eisenhart
1953, p. 104.]

The overall systematic error of a measurement 
process ordinarily consists of elemental "systematic 
errors" due to both assignable and unassignable 
causes. Those of unknown (not thought of, not 
yet identified, or as yet undiscovered) origin are 
always to be feared; allowances can be made only
for those of recognized origin.

Since the "known" systematic errors affecting a
measurement process ascribable to specific origins
are ordinarily determinate in origin only, their
individual values ordinarily being unknown both
with respect to sign and magnitude, it is not possible
to evaluate their algebraic sum and thereby arrive
at a value for the overall systematic error of the
measurement process concerned. In consequence, it
is necessary to arrive at bounds for each of the
individual components of systematic error that may
be expected to yield nonnegligible contributions,
and then from these bounds arrive at credible bounds
to their combined effect on the measurement process
concerned. Both of these steps are fraught with
difficulties.

Determination of reasonable bounds to the
systematic error likely to be contributed by a
particular origin or assignable cause necessarily
involves an element of judgment, and the limits cannot
be set in exactitude. By assigning ridiculously
wide limits, one could be practically certain that
the actual error due to a particular cause would never
lie outside of these limits. But such limits are not
likely to be very helpful. The narrower the range
between the assigned limits, the greater the uneasiness
one feels that the assigned limits will not
include whatever systematic error is contributed
by the cause in question. But a decision has to
be made; and on the basis of theory, other related
measurements, a careful study of the situation in
hand, especially its sensitivity to small changes in
the factor concerned, and so forth, "the experimenter
presently will feel justified in saying that
he feels, or believes, or is of the opinion," that the
systematic error due to the particular source in
question does not exceed such and such limits,
"meaning thereby, since he makes no claim to
omniscience, that he has found no reason for
believing" that it exceeds these limits. In other
words, "nothing has come to light in the course
of the work to indicate" that the systematic error
concerned lies outside the stated range. [Dorsey
1944, pp. 9-10; Dorsey and Eisenhart, 1953, pp.
105-107.]

This being done to each of the recognized potential 
sources of systematic error, the problem remains 
how to determine credible bounds to their combined 
effect. Before considering this problem in detail, 
it will be helpful to digress for a moment, to consider 
an instructive example relating to the combined 
effect of constant errors in an everyday situation. 

a. An Instructive Example 

Consider the hypothetical situation of an indi- 
vidual who is comparing his checkbook balance with 
his bank statement. To this end he needs to know 
the total value of his checks outstanding. Loathing 
addition, or perhaps, simply to save time, he adds 
up only the dollars, neglecting the cents, and thus 
arrives at a total of, say, $312, for 20 checks out- 
standing. Adding a correction of 50 cents per check, 
or $10 in all, he takes $322 as his estimate. Within 
what limits should he consider the error of this 
estimate to lie? 

The round-off error cannot exceed ±50 cents per 
check, so that barring mistakes in addition, he can
be absolutely certain that the total error of his 
estimate does not exceed ±$10. But these are 
extremely pessimistic limits: they correspond to 
every check being in error by the maximum possible 
amount and all in the same direction. (Actually 
the maximum possible positive error is 49 cents per 
check or +$9.80 in all.) 
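
The bookkeeping of this example can be sketched in a few lines of Python. The check amounts below are hypothetical (the text gives only the totals for its own 20 checks), and the function name is an illustrative choice; amounts are held in whole cents so that the arithmetic is exact.

```python
def estimate_total_cents(amounts_cents):
    """Add whole dollars only, then add a 50-cent correction per check."""
    whole_dollars = sum(amount // 100 for amount in amounts_cents)
    return whole_dollars * 100 + 50 * len(amounts_cents)

# Hypothetical check amounts, in cents.
checks = [1298, 4500, 799, 2349, 1050]

estimate = estimate_total_cents(checks)
error = sum(checks) - estimate      # true total minus estimate
bound = 50 * len(checks)            # worst case: 50 cents per check

print(f"estimate ${estimate / 100:.2f}, error {error} cents, bound ±{bound} cents")
```

With the error taken as true total minus estimate, each check contributes between -50 and +49 cents, which is the asymmetry noted in the parenthesis above. The actual error here is far inside the worst-case bound because the individual errors partly cancel.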

To be conservative, but not so pessimistic, one
might "allow" a maximum error of ±50 cents
per check, but consider it reasonable to regard their
signs as being equally likely to be plus or minus.
In this way one would be led to conclude "with
probability 0.95" that the total error lies between
±$7.00; or "with probability 0.99," between
±$8.00, as shown in the column headed "binomial"
in table 1, for n=20. The "saving" by this procedure
is clearly not great.



Table 1. Limits of error of a sum of n items indicated by
various methods of evaluation
(all entries are ± limits; 0.95 and 0.99 are the probability levels)

                    Binomial        Uniform      Triangular       Normal         Normal
                                                                  2σ=0.5         3σ=0.5
  n  Absolute      0.95   0.99    0.95   0.99    0.95   0.99    0.95   0.99    0.95   0.99

  1      0.50      0.50   0.50    0.48   0.50    0.39   0.45    0.49   0.64    0.33   0.43
  2      1.00      1.00   1.00    0.78   0.90    0.56   0.71    0.69   0.91    0.46   0.61
  3      1.50      1.50   1.50    0.97   1.19    0.69   0.88    0.85   1.12    0.57   0.74
  4      2.00      2.00   2.00    1.12   1.41    0.80   1.03    0.98   1.29    0.65   0.86
  5      2.50      2.50   2.50    1.25   1.60    0.89   1.15    1.10   1.44    0.73   0.96
  6      3.00      2.50   3.00    1.38   1.76    0.98   1.29    1.20   1.58    0.80   1.05
  7      3.50      3.00   3.50    1.49   1.91    1.06   1.39    1.30   1.70    0.86   1.14
  8      4.00      3.50   3.50    1.59   2.05    1.13   1.49    1.39   1.82    0.92   1.21
  9      4.50      3.50   4.00    1.69   2.18    1.20   1.58    1.47   1.93    0.98   1.29
 10      5.00      4.00   4.50    1.78   2.31    1.26   1.66    1.55   2.04    1.03   1.36
 15      7.50      5.50   6.00    2.19   2.88    1.55   2.04    1.90   2.49    1.27   1.69
 20     10.00      7.00   8.00    2.53   3.33    1.79   2.35    2.19   2.88    1.46   1.92
 25     12.50      8.50   9.50    2.83   3.72    2.00   2.63    2.45   3.22    1.63   2.15
 30     15.00     10.00  11.00    3.07   4.03    2.19   2.88    2.68   3.53    1.79   2.35
 40     20.00     13.00  14.00    3.58   4.70    2.53   3.33    3.10   4.07    2.07   2.72
 50     25.00     16.00  17.00    4.00   5.26    2.83   3.72    3.46   4.55    2.31   3.04
 60     30.00     19.00  20.00    4.38   5.76    3.10   4.07    3.80   4.99    2.53   3.33



Alternatively, one might consider it to be more
"realistic" to regard the individual errors as
independently and uniformly distributed between -50
cents and +50 cents, concluding "with probability
0.95" that the total error does not exceed ±$2.53;
or "with probability 0.99," is not greater than
±$3.33, as shown in the columns under the heading
"uniform" in table 1. It is clear that a considerable
reduction in the estimate of the total error is achieved
by this approach.

Strictly speaking, the foregoing analyses via the
theory of probability are both inapplicable to the
problem at hand: each round-off error is a fixed
number between ±50 cents, and their sum is a fixed
number between ±$10. If it were true that round-off
errors in such cases were uniformly distributed
between ±50 cents, then, if one made a habit of
evaluating limits of error according to this procedure,
one could expect the limits of error so calculated to
include the true total error in 95 percent, or 99 percent,
of the instances in which this procedure was
used in the long run. Round-off errors in such cases are
almost certainly not uniformly distributed between
±50 cents. (Many items are priced these days at
$2.98 etc., and this will distort the distribution of the
cents-portion of one's bills, but added sales taxes no
doubt have a "smoothing" effect.)
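
The long-run interpretation described in this paragraph can be checked by simulation. The sketch below, in Python, is an illustration under the stated (and, as the text notes, doubtful) assumption of independent, uniformly distributed round-off errors; the function name `coverage`, its parameters, the seed, and the number of trials are hypothetical choices.

```python
import random

def coverage(n=20, half_width=0.50, limit=2.53, trials=20000, seed=0):
    """Fraction of simulated sums of n uniform errors that fall within ±limit."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        total = sum(rng.uniform(-half_width, half_width) for _ in range(n))
        if abs(total) <= limit:
            hits += 1
    return hits / trials

print("long-run coverage of ±$2.53 for n=20:", coverage())
```

If the uniform assumption held, the fraction reported should settle near 0.95 as the number of trials grows. Nothing in the simulation bears on whether the assumption is true of real checks, which is the reservation the text raises.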

Nevertheless, I believe that you will agree that if,
in the hypothetical case under discussion, the
checkbook balance, with an allowance of $322 for
checks outstanding, failed to agree with the bank
statement to within $2.53 (or $3.33), our "friend"
would do well to check into the matter more thoroughly.
And, alternatively, if his checkbook balance
so adjusted, and the bank statement, agreed to
within $2.53 (or $3.33), it would be reasonably
"safe" for him to "act for the present as if" his
balance and the bank statement were in agreement.
(See Eisenhart [1947a, p. 218] for discussion of a
similar example relating to computation with
logarithms.)

b. Combination of Allowances for Systematic Errors 

The foregoing example suggests that a similar 
procedure be used for arriving at credible limits to 
the likely overall effect of systematic errors due to a
number of different origins. A number of additional 
difficulties confront us, however, in this case. To 
begin with, in view of the inexactness with which 
bounds can ordinarily be placed on each of the indi- 
vidual components of systematic error, it is not 
possible to say with absolute certainty that their 
combined effect lies between the sum of the positive 
bounds and the sum of the negative bounds. 

Second, even if it were possible to scale the situation
so that the bounds for each of the components
of systematic error were the same, say, ±A, there
would still remain the problem of translation into an
appropriate probability calculus. Most persons
would, I believe, regard the "binomial" approach
(corresponding to equal probability of maximum
error in either direction), as too pessimistic; and the
approach via a uniform distribution of error, as a bit
conservative, on the grounds that one intuitively
feels that the individual errors are somewhat more
likely to lie near the centers than near the ends of
their respective ranges. Therefore, one might attempt
to simulate this "feeling" by assuming the
"law of error" to be an isosceles triangle centered at
zero with ends at ±A; or, more daringly, by assuming
the "law of error" to be approximately normal with
A corresponding to 2"σ" or even 3"σ."
Unfortunately, whatever "probability limits" may
be placed upon the combined effects of several
independent systematic errors by these procedures are
quite sensitive to the assumption made at this stage,
as is evident from table 1. Therefore, anyone who
uses one of these methods for the "combination of
errors" should indicate explicitly which of these (or
an alternative method) he has used. When (a) the
number of systematic errors to be combined is large,
(b) the respective ranges are approximately equal in
size, and (c) one feels "fairly sure" that the individual
errors do not fall outside of their respective
ranges, then my personal feeling is that the "uniform"
method is probably a wee bit conservative
but "safe"; the triangular method is a bit "too
daring"; the normal method with "σ" = A/3 ordinarily
"much too daring"; but the normal method
with "σ" = A/2, probably "not too daring." When
(b) and (c) hold but n is small, then it will probably
be safe to use the "uniform" method with "A" taken
equal to the average of the individual ranges.
Other cases, e.g., when n is large but, say, one or two
of the ranges is (are) much larger than the others
and tend(s) to dominate the situation, require
special consideration which is beyond the scope of
the present paper.
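
The sensitivity to the assumed "law of error" can be seen from a short calculation. The Python sketch below is an illustration, not the method by which table 1 was computed: it treats the sum of n independent errors as approximately normal, which is an assumption reasonable only when n is fairly large, and uses the standard deviation of a single error under each law (A/√3 for the uniform, A/√6 for the triangular, A/2 and A/3 for the two normal cases). The names `SD_FACTOR` and `combined_limit` are hypothetical.

```python
from statistics import NormalDist

# Standard deviation of a single error, as a multiple of the bound A.
SD_FACTOR = {
    "uniform": 1 / 3 ** 0.5,      # uniform on (-A, +A)
    "triangular": 1 / 6 ** 0.5,   # isosceles triangle on (-A, +A)
    "normal_2sigma": 1 / 2,       # normal with A = 2 sigma
    "normal_3sigma": 1 / 3,       # normal with A = 3 sigma
}

def combined_limit(n, bound, law, level=0.95):
    """Approximate ± limit for the sum of n independent errors (normal approximation)."""
    z = NormalDist().inv_cdf(0.5 + level / 2)
    return z * SD_FACTOR[law] * bound * n ** 0.5

for law in SD_FACTOR:
    low = round(combined_limit(20, 0.50, law), 2)
    high = round(combined_limit(20, 0.50, law, level=0.99), 2)
    print(f"{law:14s} 0.95: ±{low:.2f}   0.99: ±{high:.2f}")
```

For n = 20 and A = 0.50 these figures agree with the corresponding row of table 1 for the uniform, triangular, and two normal columns. The approximation should not be used for small n, where the exact distribution of the sum differs noticeably from the normal.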

4.3. Expression of the Inaccuracy of a Measurement 
Process 

By whatever means credible bounds to the likely
overall systematic error of the measurement process
are obtained, they should not be combined (by simple
addition, by "quadrature," or otherwise) with an
experimentally determined measure of its standard
deviation to obtain an overall index of its accuracy (or,
more correctly, of its inaccuracy). Rather (a) the
standard deviation of the process and (b) credible
bounds to its systematic error should be stated
separately, because, as we showed in figure 3, a
measurement process having standard deviation σ=0.25
and a bias A = √(15/16) = 0.97 is for most purposes
"more accurate" than a measurement process having
zero bias and standard deviation σ=1, so that a process
with σ=0.25 and a bias less than ±0.97 will a
fortiori be "more accurate."
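
The arithmetic behind this comparison can be sketched as follows, in Python. Combining the standard deviation and the bias "in quadrature" (one of the combinations advised against above) gives the same single index for both processes, so that such an index cannot show the difference between them. The function name `rms_error` is hypothetical; the numbers are those quoted in the text.

```python
import math

def rms_error(sd, bias):
    """Root-mean-square error: standard deviation and bias combined in quadrature."""
    return math.sqrt(sd ** 2 + bias ** 2)

biased = rms_error(0.25, math.sqrt(15 / 16))   # sigma = 0.25, bias about 0.97
unbiased = rms_error(1.0, 0.0)                 # sigma = 1, zero bias

print("biased process:  ", round(biased, 3))
print("unbiased process:", round(unbiased, 3))
```

Both processes have a root-mean-square error of 1, although their individual measurements behave very differently; stating the standard deviation and the bounds to the systematic error separately preserves that distinction.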

Finally, if the uncertainty in the assigned value
of a national standard or of some fundamental constant
of nature (e.g., in the volt as maintained at the
National Bureau of Standards, or in the speed of light
c, or in the acceleration of gravity g on the Potsdam
basis) is an important potential source of systematic
error affecting the measurement process, no allowance
for possible systematic error from this source should
ordinarily be included in evaluating overall bounds
to the systematic error of the measurement process.
Since the error concerned, whatever it is, affects all
results obtained by the method of measurement
involved, to include an allowance for this error would
be to make everybody's results appear unduly
inaccurate relative to each other. Instead, in such
instances one should state (a) that results obtained by
the measurement process concerned are in terms of
the volt (or the watthour, or the kilogram, etc.)
"as maintained at the National Bureau of Standards"
[McNish and Cameron 1960, p. 102], or
"correspond to the speed of light (c = 2.997925 × 10^10
cm/sec) exactly," say; and (b) that the indicated
bounds to the systematic error of the process are
exclusive of whatever errors may be present from
this (or these) source(s). Given such information,
experts can make such additional allowances, as may
be needed, in fundamental scientific work; and
comparative measurements within science and industry
within the United States will not appear to be less
accurate than they very likely are for the purposes
for which they are to be used.



It is a pleasure to acknowledge the technical assist- 
ance of Janace A. Speckman in several phases of the 
preparation of this paper. 

5. Bibliography 

Airy, George Biddell (1861), On the Algebraical and Numerical
Theory of Errors of Observations and the Combination
of Observations (Macmillan and Co., Cambridge and
London).

Almer, H. E., L. B. Macurdy, H. S. Peiser, and E. A. Weck
(1962), Weight calibration schemes for two knife-edge
direct-reading balances, J. Research NBS 66C (Eng. and
Instr.) No. 1, pp. 33-44.

American Standards Association (1958a), Guide for quality
control, American Standard Z 1.1-1958. (American
Standards Association, 70 East Forty-fifth St., New York
17, N.Y.)

American Standards Association (1958b), Control chart
method of analyzing data, American Standard Z 1.2-1958.
(American Standards Association, 70 East Forty-fifth St.,
New York 17, N.Y.)

American Standards Association (1958c), Control chart
method of controlling quality during production, American
Standard Z 1.3-1958. (American Standards Association,
70 East Forty-fifth St., New York 17, N.Y.)

American Society for Testing Materials (1951), ASTM
Manual on Quality Control of Materials, Special Technical
Publication 15-C (American Society for Testing Materials,
1916 Race St., Philadelphia 3).

American Society for Testing and Materials (1961), Use of
the terms precision and accuracy as applied to measurement
of a property of a material, ASTM Designation:
E 177-61T. Reprinted from ASTM Standards, Pt 11,
pp. 1758-1766.

Baird, D. C. (1962), Experimentation: An Introduction to
Measurement Theory and Experiment Design, (Prentice-Hall,
Inc., Englewood Cliffs, N.J.).

Beers, Yardley (1953), Introduction to the Theory of Error,
(Addison-Wesley Publishing Co., Cambridge 42, Mass.).

Bicking, Charles A. (1952), The reliability of measured
values - an illustrative example, Photogrammetric
Engineering XVIII, pp. 554-558.

Cameron, J. M. (1951), The use of components of variance
in preparing schedules for the sampling of baled wool,
Biometrics 7, pp. 83-96.

Chauvenet, William (1868), A Manual of Spherical and
Practical Astronomy Vol. II, 4th edition, (J. B. Lippincott
and Co., Philadelphia).

Cochran, William G., Frederick Mosteller, and John W.
Tukey (1953), Statistical problems of the Kinsey report,
J. Am. Stat. Assoc. 48, pp. 673-716.

Cochran, William G., Frederick Mosteller, and John W.
Tukey (1954), Principles of sampling, J. Am. Stat. Assoc.
49, pp. 13-35.

Crow, Edwin L. (1960), An analysis of the accumulated error
in a hierarchy of calibrations, IRE Trans. Instr. I-9,
pp. 105-114.



Deming, W. Edwards and Raymond T. Birge (1937), On the 
Statistical Theory of Errors, reprinted from Reviews of 
Modern Physics 6, pp. 119-161 (1934) with additional 
notes dated 1937 (The Graduate School, U.S. Department 
of Agriculture, Washington 25, D.C.). 

Deming, W. Edwards (1943), Statistical Adjustment of Data
(John Wiley & Sons, Inc., New York, N.Y.).

Deming, W. Edwards (1950), Some Theory of Sampling
(John Wiley & Sons, New York, N.Y.).

Dorsey, N. Ernest (1944), The velocity of light, Transactions
American Philosophical Society XXXIV, pp. 1-110.

Dorsey, N. Ernest and Churchill Eisenhart (1953), On absolute
measurement, The Scientific Monthly LXXVII, pp.
103-109.

Eisenhart, Churchill (1947a), Effects of rounding or grouping
data, Chapter 4 of Techniques of Statistical Analysis,
edited by C. Eisenhart, M. W. Hastay, W. A. Wallis
(McGraw-Hill Book Co., New York, N.Y.).

Eisenhart, Churchill (1947b), Planning and interpreting
experiments for comparing two standard deviations, Chapter
8 of Techniques of Statistical Analysis, edited by C.
Eisenhart, M. W. Hastay, W. A. Wallis (McGraw-Hill
Book Co., New York, N.Y.).

Eisenhart, Churchill (1949), Probability center lines for
standard deviation and range charts, Industrial Quality
Control VI, pp. 24-26.

Eisenhart, Churchill (1952), The reliability of measured
values - fundamental concepts, Photogrammetric
Engineering XVIII, pp. 542-554 and 558-565.

Eisenhart, Churchill (1962), On the realistic measurement of
precision and accuracy, ISA Proceedings of the Eighth
National Aero-Space Instrumentation Symposium held in
Washington, May 1962, pp. 75-83.

Feller, William (1957), An Introduction to Probability
Theory and its Applications, Vol. 1, 2d edition (John
Wiley & Sons, New York, N.Y.).

Galilei, Galileo (1638), Discorsi e Dimostrazioni Matema- 
tiche Intorno a Due Nuove Scienze, Leiden. 

Galilei, Galileo (1898), Discorsi e Dimostrazioni Matematiche 
Intorno a Due Nuove Scienze, Le Opere di Galileo Galilei 
(Edizione Nazionale) VIII, pp. 39-448, Firenze. 

Galilei, Galileo (1914), Dialogues Concerning Two New
Sciences, translated by Henry Crew and Alfonso de Salvio,
with an Introduction by Antonio Favaro (The Macmillan
Co., New York, N.Y.).

Gauss, C. F. (1809), Theoria Motus Corporum Coelestium
in Sectionibus Conicis Solem Ambientium, Frid. Perthes
et I. H. Besser, Hamburg; reprinted in Carl Friedrich
Gauss Werke, Band VII, Gotha, 1871.

Gauss, C. F. (1823), Theoria Combinationis Observationum
Erroribus Minimis Obnoxiae, Commentationes societatis
regiae scientiarum Gottingensis recentiores, V, pp. 1-104,
Gottingae; reprinted in Carl Friedrich Gauss Werke,
Band IV, Gottingen, 1873.

Gauss, C. F. (1839), letter to F. W. Bessel dated February
28, 1839, reproduced in "Kritische bemerkungen zur
methode der kleinsten quadrate," pp. 142-148 in Carl
Friedrich Gauss Werke, Band VIII, (B. G. Teubner,
Leipzig, 1900).

Gauss, C. F. (1857), Theory of the Motion of the Heavenly
Bodies Moving About the Sun in Conic Sections; English
translation by Charles Henry Davis (Little, Brown and
Co., Boston).

Gnedenko, B. V. (1962), The Theory of Probability (English
translation by B. D. Seckler), (Chelsea Publishing Co.,
New York, N.Y.).

Hermach, F. L. (1961), An analysis of errors in the calibration
of electric instruments, Communication and Electronics
(AIEE) 54, pp. 90-95.

Hillebrand, W. F., G. E. F. Lundell, H. A. Bright, and J. I.
Hoffman (1953), Applied Inorganic Analysis, 2d ed., (John
Wiley & Sons, Inc., New York, N.Y.).

Holman, Silas Whitcomb (1892), Discussion of the Precision
of Measurements, (John Wiley and Sons, New York, N.Y.).

Keyser, Cassius J. (1922), Mathematical Philosophy, (E. P.
Dutton and Co., New York, N.Y.).

Kline, S. J. and F. A. McClintock (1953), Describing uncertainties
in single-sample experiments, Mech. Eng. 75,
pp. 3-8.

Laplace, Pierre Simon (1886), Théorie Analytique des
Probabilités; 3d edition, Vol. 7 of Oeuvres Complètes de
Laplace publiées sous les auspices de l'Académie des Sciences,
(Gauthier-Villars, Imprimeur-Libraire de l'École Polytechnique,
du Bureau des Longitudes, Successeur de
Mallet-Bachelier, Quai des Grands-Augustins, 55, Paris).

McNish, A. G. and J. M. Cameron (1960), Propagation of
error in a chain of standards, IRE Trans. Instr. I-9,
pp. 101-104.

Millikan, R. A. (1903), Mechanics, Molecular Physics, and
Heat, (Ginn and Co., New York, pp. 195-196).

Murphy, R. B. (1961), On the meaning of precision and
accuracy, Materials Research and Standards 4, pp. 264-267.

NPL (1957), Calibration of temperature measuring instruments,
National Physical Laboratory Notes on Applied
Science, No. 12, pp. 29-30, (Her Majesty's Stationery
Office, London).

Ostle, Bernard (1954), Statistics in Research, (The Iowa 
State College Press, Ames, Iowa). 

Parzen, Emanuel (1960), Modern Probability Theory and 
its Applications, (John Wiley & Sons, New York, N.Y.). 

Proschan, Frank (1953), Confidence and tolerance intervals 
for the normal distribution, J. Am. Stat. Assoc. 48, pp. 
550-564. 

Rossini, F. D. and W. Edwards Deming (1939), The assign- 
ment of uncertainties to the data of chemistry and physics, 
with specific recommendations for thermochemistry, 
J. Wash. Acad. Sci. 29, pp. 416-441. 

Schenck, Hilbert, Jr. (1961), Theories of Engineering
Experimentation, (McGraw-Hill Book Co., Inc., New York, N.Y.).

Schrock, Edward M. (1950), Quality Control and Statistical
Methods, (Reinhold Publishing Corp., New York 48, N.Y.).

Shewhart, W. A. (1931), Economic Control of Quality of
Manufactured Product, (D. Van Nostrand Company, Inc.,
New York, N.Y.).

Shewhart, Walter A. (1939), Statistical Method from the
Viewpoint of Quality Control, (The Graduate School, U.S.
Department of Agriculture, Washington, D.C.).

Shewhart, Walter A. (1941), Contribution of statistics to
the science of engineering, University of Pennsylvania
Bicentennial Conference, Volume on Fluid Mechanics and
Statistical Methods in Engineering, pp. 97-124, (University
of Pennsylvania Press, Philadelphia).

Simon, Leslie E. (1941), An Engineer's Manual of Statistical
Methods, (John Wiley & Sons, Inc., New York, N.Y.).

Simon, Leslie E. (1942), Application of statistical methods
to ordnance engineering, J. Am. Stat. Assoc. 37, pp.
313-324.

Simon, Leslie E. (1946), On the relation of instrumentation
to quality control, Instruments 19, pp. 654-656 (Nov.
1946); reprinted in Photogrammetric Engineering XVIII,
pp. 566-573 (June 1952).

Spinks, A. W. and T. L. Zapf (1954), Precise comparison
method of testing alternating-current watthour meters, J.
Research NBS 53, pp. 95-105.

Student (1908), The probable error of a mean, Biometrika
VI, No. 1, pp. 1-25.

Student (1926), Mathematics and agronomy, Journal of the
American Society of Agronomy XVIII, pp. 703-719.

Student (1927), Errors of routine analysis, Biometrika
XIX, pp. 151-164.

Swindells, James F. (1959), Calibration of liquid-in-glass 
thermometers, NBS Circ. 600, pp. 11-12, (U.S. Govern- 
ment Printing Office, Washington 25, D.C.). 

Tukey, J. W. (1960), Conclusions vs. decisions, Techno- 
metrics 2, No. 4, pp. 423-433. 

Waidner, C. W. and H. C. Dickinson (1907), On the standard
scale of temperature in the interval 0° to 100° C, Bul.
Bur. Stds. 3, pp. 663-728.

Webster's Dictionary of Synonyms (1942, 1st ed.), (G. and 
C. Merriam Co., Springfield, Mass.). 

Youden, W. J. (1950), Comparative tests in a single lab- 
oratory, ASTM Bulletin No. 166, pp. 48-51. 

Youden, W. J. and J. M. Cameron (1950), Use of statistics 
to determine precision of test methods, Symposium on 
Application of Statistics, Special Technical Publication 
No. 103, pp. 27-34, (American Society for Testing Ma- 
terials, Philadelphia). 



Youden, W. J. (1951a), Statistical Methods for Chemists, 
(John Wiley & Sons, New York, N.Y.). 

Youden, W. J. (1951b), Locating sources of variability in a 
process, Ind. Eng. Chem. 43, pp. 2059-2062. 

Youden, W. J. (1953), Sets of three measurements, The 
Scientific Monthly LXXVII, pp. 143-147. 

Youden, W. J. (1954-1959), Statistical Design, Industrial 
and Engineering Chemistry, Feb. 1954 to Dec. 1959, 
Bimonthly articles collected in a single booklet available 
from Reprint Department, ACS Applied Publications, 
1155 Sixteenth St., Washington 6, D.C. 

Youden, W. J., W. S. Connor, and N. C. Severo (1959), 
Measurements made by matching with known standards, 
Technometrics 1, pp. 101-109. 

Youden, W. J. (1960), The sample, the procedure, and the 
laboratory, Anal. Chem. 32, pp. 23A-37A. 

Youden, W. J. (1961a), How to evaluate accuracy, Mat. Res. 
& Std. 1, pp. 268-271. 

Youden, W. J. (1961b), What is the best value? J. Wash. 
Acad. of Sci. 51, pp. 95-97. 

Youden, W. J. (1961c), Statistical problems arising in the 
establishment of physical standards, Proceedings Fourth 
Berkeley Symposium on Mathematical Statistics and 
Probability III, pp. 321-335 (University of California 
Press, Berkeley and Los Angeles). 



Youden, W. J. (1961d), Systematic errors in physical con- 
stants, Phys. Today 14, pp. 32-42 (1961); also in Tech- 
nometrics 4, pp. 111-123 (1962). 

Youden, W. J. (1962a), Experimentation and Measurement, 
National Science Teachers Association Vistas of Science 
Series No. 2, (Scholastic Book Services, New York 36, 
N.Y.). 

Youden, W. J. (1962b), Measurement Agreement Compari- 
sons, presented at the Standards Laboratory Conference, 
National Bureau of Standards, Boulder, Colo., August 8- 
10, 1962. 

Youden, W. J. (1962c), Uncertainties in calibration, IRE Trans. 
Instr. I-11, pp. 133-138 (1962). 

Youden, W. J. (1962d), Realistic estimates of errors in meas- 
urements, ISA Journal 9, No. 10, pp. 57-58. 



(Paper 67C2-128) 



JOURNAL OF RESEARCH of the National Bureau of Standards — C. Engineering and Instrumentation 

Vol. 67C, No. 2, April-June 1963 

Publications of the National Bureau of Standards* 



Selected Abstracts 

Quantitative metallography with a digital computer: applica- 
tion to an Nb-Sn superconducting wire, G. A. Moore and 
L. L. Wyman, J. Research NBS 67A (Phys. and Chem.) 
No. 2, 127 (Mar.-Apr. 1963). 70 cents. 

Accurate quantitative data pertinent to the structure of solid 
materials at the micro size level, which are difficult or pro- 
hibitive to obtain by traditional manual measurements, are 
now obtained directly by a digital computer which uses a 
photomicrograph as the information input. The history of 
picture interpretation experiments at the NBS is reviewed. 
The fundamental computer operations are illustrated, to- 
gether with a description of 24 image processing routines now 
functional at a practical level. 

A micrograph of a specimen of Nb-Sn superconductor wire is 
exhaustively analyzed. This specimen is found to contain 
approximately 70 percent Nb₃Sn, nearly all of which is 
mutually interconnected. It is also found that in this 
specimen the mean free path in the Nb₃Sn superconducting 
phase is only 26./) microns. This small value results from the 
spongy structure of the material and numerous interruptions 
caused by voids and by particles of 4 other solid phases. 
The comparative importance of the several types of inter- 
ruptions is measured. It is determined that small voids are 
the most important single cause of the short mean free path, 
and deduced that these voids appear to have formed mainly 
from the reaction during heat treatment. 

Moire fringes produced by a point projection x-ray micro- 
scope, S. B. Newman, J. Research NBS 67A (Phys. and 
Chem.) No. 2, 149 (Mar.-Apr. 1963). 70 cents. 
Moire fringes produced by soft X-rays passing through 
crossed gratings of fine wire mesh are demonstrated. Regular 
systems of bands appearing superimposed on radiomicro- 
graphs of oriented cellulosic structures may also be moire 
fringes. These fringes could be formed by fibrillate structures 
acting as crossed diffraction gratings. 

A method for determining the elastic constants of a cubic 
crystal from velocity measurements in a single arbitrary 
direction; application to SrTiO₃, J. B. Wachtman, Jr., M. L. 
Wheat, and S. Marzullo, J. Research NBS 67A (Phys. and 
Chem.) No. 2, 205 (Mar.-Apr. 1963). 70 cents. 
A method is given for calculating the three elastic constants 
of a cubic crystal and their standard deviations from the 
three velocities of sound and their standard deviations 
measured in a single direction. The method is applicable to 
any direction except [100] and [111]. 

A new type of computable inductor, C. H. Page, J. Research 
NBS 67B (Math. and Math. Phys.) No. 1, 31 (Jan.-Mar. 1963). 
75 cents. 

The mutual inductance analog of the generalized Thompson- 
Lampard theorem (for cross-capacitances) is developed. 
An infinitely long cage of five parallel wires can yield an 
absolute inductance of 

3 + ^5 
henries per meter. End-effects of order I//- occur in a 
finite cage, but can be reduced to order 1//'' by using eight 
wires. 

The eight wire cage has the advantage of overdetermined 
relations among the inductances to be measured, allowing an 
estimate of experimental error in the calibration of a standard. 
Errors due to faulty cage geometry are shown to be of the 
order of 1 in 10^. 



Input admittance of linear antennas driven from a coaxial 
line, T. T. Wu, J. Research NBS 67D (Radio Prop.) No. 1, 
83-89 (Jan.-Feb. 1963). 70 cents. 

In two cases of a linear antenna driven from a coaxial line, 
it is shown that the apparent terminal admittance to the 
coaxial line can be additively separated into two parts when 
the transverse dimensions are small compared with the wave- 
length. One of these two parts depends only on the wave- 
length and the dimensions of the antenna, while the other 
part can be interpreted as a capacitance that depends only 
on the radii of the coaxial line. This capacitance may be 
found exactly from the solution of an integral equation, in 
the sense that further corrections cannot be interpreted 
simply as a capacitance. 

Corrosion of steel pilings in soils, M. Romanoff, NBS Mono. 
58 (Oct. 24, 1962), 20 cents. 

Steel pilings have been used for many years as structural 
members of dams, floodwalls, bulkheads, and as load-bearing 
foundations. While its use is presumably satisfactory, no 
evaluation of the material after long service has been made. 
In cooperation with the American Iron and Steel Institute 
and the U.S. Corps of Engineers, the National Bureau of 
Standards has undertaken a project to investigate the extent 
of corrosion on steel piles after many years of service. 
Results of inspections made on steel pilings which have been 
in service in various underground structures under a wide 
variety of soil conditions for periods of exposure up to 40 
years are presented. 

In general, no appreciable corrosion of steel piling was found 
in undisturbed soil below the water table regardless of the 
soil types or soil properties encountered. Above the water 
table and in fill soils corrosion was found to be variable but 
not serious. 

It is indicated that corrosion data previously published by 
the National Bureau of Standards on specimens exposed 
under disturbed soil conditions do not apply to pilings which 
are driven in undisturbed soils. 

Radiation quantities and units, International Commission on 
Radiological Units and Measurements (ICRU) Report 10a, 
1962, NBS Handb. 84 (Nov. 14, 1962), 20 cents. 
This Handbook presents definitions of 23 fundamental 
radiation quantities and units. It resulted from a 3-year 
study by the Ad Hoc Committee on Quantities and Units of 
the ICRU. It includes new names for certain quantities 
and clarified definitions for others. It presents a system of 
concepts and a set of definitions which is internally consistent 
and yet of sufficient generality to cover present requirements 
and such future requirements as can be foreseen. 

A tabulation of the thermodynamic properties of normal 
hydrogen from low temperature to 300° K and from 1 to 100 
atmospheres, J. W. Dean, NBS Tech. Note 120 (Nov. 1961), 
45 cents. 

Pressure, volume, temperature, internal energy, enthalpy, 
and entropy of normal hydrogen gas have been tabulated 
along isobars in 1 °K temperature steps. The range covered 
is from the saturation temperature to 300 °K and from a 
pressure of 1 to 100 atmospheres. The source of data is the 
Research Paper 1932 of the National Bureau of Standards 
Journal of Research. The method is described by which the 
data presented in Research Paper 1932 is reduced to proper- 
ties directly useful for engineering calculations. A method is 
also described for estimating the effect of ortho-para compo- 
sitions upon the tabulated properties. 

Tabular values are presented in the dimensional units of the 
metric system. The tabulations are also available in the 
dimensional units of the British system as Technical Note 
No. 120, Supplement A. 

Controlled temperature oil baths for saturated standard cells, 
P. H. Lowrie, Jr., NBS Tech. Note 141 (Aug. 1962), 25 cents. 
Two oil baths for the temperature control of saturated stand- 
ard cells have been designed and fabricated at the Boulder 
Laboratories of the National Bureau of Standards for opera- 
tion at 28 °C and 35 °C respectively. Short term control to 
better than ±0.001 °C with day-to-day variations no greater 
than 0.002 °C has been achieved with the use of a mercury- 
toluene thermoregulator incorporating a temperature antici- 
pating device. The circulating system limits temperature 
gradients in the oil to less than 0.001 °C across any 10-inch 
section. The baths incorporate preheat and drain tanks as 
well as the main temperature regulated tank to facilitate the 
insertion and removal of cells and to minimize oil spillage. 

Coordinated color identifications for industry, K. L. Kelly, 
NBS Tech. Note 152 (Nov. 1962), 15 cents. 
When a color is to be identified, the preciseness required of 
the identification is the first consideration. Usually this is 
determined by a trial-and-error method which can be both 
costly and time-consuming. For some uses, a color name 
consisting of a hue name or a hue name and modifier is sufficient 
while for others, a notation of the color in a color-order sys- 
tem will suffice. Where maximum precision is required, the 
color should be measured instrumentally and the results ex- 
pressed numerically. This paper describes the coordinated 
series of five levels of fineness of color identification developed 
by ISCC Subcommittee for Problem 23, the Expression of 
Historical Color Usage, and is based on the ISCC-NBS 
method of designating colors. It lists the methods for 
changing from one level to another and gives examples of the 
use of each level. 

The thermodynamic properties of helium from 6 to 540° R 
between 10 and 1,500 psia, D. B. Mann, NBS Tech. Note 
154A (Jan. 1962), 50 cents. 
The specific volume, enthalpy, entropy, and internal energy 
values of helium are presented in tabular form as functions of 
pressure and temperature. 
Data are tabulated in two-degree Rankine increments for 36 
isobars between 10 psia and 1,500 psia. A comparison with 
previously published data is made where applicable. 
An expression is presented which represents the pressure- 
density-temperature surface based on previously published 
data. 
The tabulation is presented in the dimensional units of the 
British system but is also available in the dimensional units 
of the metric system. 
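
For readers comparing this Technical Note with the Kelvin-based hydrogen tabulation listed above, the temperature range and tabulation step can be restated in kelvins. The short sketch below uses only the standard relation 1 °R = 5/9 K; the function name is illustrative and does not come from the Technical Note.

```python
def rankine_to_kelvin(t_rankine: float) -> float:
    """Convert degrees Rankine to kelvins (both scales share absolute zero)."""
    return t_rankine * 5.0 / 9.0

# Range and increment quoted in the abstract, restated in kelvins.
low_k = rankine_to_kelvin(6.0)
high_k = rankine_to_kelvin(540.0)
step_k = rankine_to_kelvin(2.0)
print(f"{low_k:.2f} K to {high_k:.1f} K in steps of {step_k:.2f} K")
```

On this reading the upper limit of 540 °R coincides with the 300 °K upper limit of the hydrogen tables.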

Emission stabilization of thermionic diode noise sources, 
M. W. Randall and M. G. Arthur, NBS Tech. Note 160 
(Sept. 1962), 15 cents. 

An apparatus is described which is capable of stabilizing the 
d-c plate current of a temperature-limited thermionic diode 
noise source to better than 0.02 percent, which corresponds 
to a noise power stability of better than 0.001 db throughout 
the current range of 1 ma to 100 ma. 
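
The correspondence between the two stability figures can be checked with a short calculation. The sketch below assumes that the noise power of a temperature-limited diode is directly proportional to its d-c plate current (the ideal shot-noise case); that proportionality and the function name are assumptions made for illustration, not statements taken from the Technical Note.

```python
import math

def noise_power_change_db(current_change_percent: float) -> float:
    """Change in noise power, in decibels, for a given percentage change
    in plate current.

    Assumption: noise power is proportional to plate current (ideal shot
    noise), so the power ratio equals the current ratio.
    """
    ratio = 1.0 + current_change_percent / 100.0
    return 10.0 * math.log10(ratio)

# A 0.02 percent change in plate current, as quoted in the abstract.
drift_db = noise_power_change_db(0.02)
print(f"{drift_db:.5f} db")
```

Under that assumption a 0.02 percent current change amounts to somewhat less than 0.001 db, in line with the figure quoted.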

Evaluation of unexpectedly large radiation exposures by 
means of photographic film, W. L. McLaughlin, NBS Tech. 
Note 161 (Aug. 1962), 15 cents. 

Conventional film types used in personnel monitoring film 
badges are suitable for measuring X- and γ-radiation expo- 
sures only up to 1,000 R. By using special processing pro- 
cedures, it is possible to extend the range of the less sensitive 
component of most commercial film packets up to at least 
10,000 R. Limitations in precision of readings due to changes 
in rate dependence, energy dependence, and changes in the 
shape of the characteristic curve in this range are discussed. 

Exchange behavior of kaolins of varying degrees of crystal- 
linity, W. C. Ormsby, J. M. Shartsis, and K. H. Woodside, 
J. Am. Ceram. Soc. 45, No. 8, 361-366 (Aug. 1962). 
Particle-size fractions of several Georgia kaolins, which were 
prepared by sedimentation procedures, were examined from 
the standpoint of crystallinity, cation-exchange capacity, 
and surface area. Crystallinity was studied using X-ray 
techniques, exchange capacities were measured using the 
manganese saturation method, and surface areas were de- 
termined using glycerol adsorption techniques. A linear 
relation was obtained between surface areas and exchange 
capacities when areas were increased by decreasing the par- 
ticle size or by changing from well crystallized to poorly 
crystallized kaolins. In most cases, the empirically deter- 
mined crystallinity ratios indicated a change in crystallinity 
with change in particle size, the crystallinity generally in- 
creasing with decreasing particle size in individual samples. 
However, unlike the relation noted for exchange, the crys- 
tallinity did not consistently correlate with area changes 
both among the various samples and within the various 
particle size fractions of a single sample. These results sug- 
gest that the relatively high cation exchange capacities of 
poorly ordered kaolins are more directly a result of high surface 
areas with crystallinity playing, at most, a very minor role. 

Microwave measurements in the NBS Electronic Calibration 
Center, R. E. Larson, Inst. Elec. Engrs. 109, Pt. B, Suppl. 
No. 23, 644-650 (1962). 

In the Electronic Calibration Center of the National Bureau 
of Standards Radio Standards Laboratory, Boulder, Colo., 
work is proceeding towards the establishment or extension of 
calibration services over a broad range of frequencies in the 
microwave region. At the present time, measurements can 
be made over limited portions of the frequency spectrum for 
the quantities of low-level c.w. power, reflection coefficient, 
frequency and attenuation. Calibration services are cur- 
rently provided for all of these quantities. Instrumentation 
for the measurement of microwave noise is near completion. 
A survey is given of the microwave measurement techniques 
employed in this work. 

A modulated sub-carrier technique of measuring micro- 
wave attenuation, G. E. Schafer and R. R. Bowman, Inst. 
Elec. Engrs. 109, Pt. B, Suppl. No. 23, 788-786 (1962). 
A method of measuring microwave attenuation is proposed 
which has the advantages of an i-f substitution method with 
single-sideband operation. However, ordinary amplitude 
modulation is used, and neither the carrier nor one of the 
sidebands needs to be suppressed. 

Two versions of instrumenting this method are described and 
some operational hints are given. One of these versions is 
capable of high accuracy with commercially available equip- 
ment. The proposed method was tested by measuring the 
relative attenuation of a microwave variable attenuator at 
9.3897 Gcps, attaining a precision of 0.0001 db at 0.01 db 
and 0.2 db at 50 db. The measurements are compared with 
calibrations performed by other methods. A special com- 
parison with values obtained from d-c substitution techniques 
was made in which environmental effects were largely elimi- 
nated. 

Factors affecting the accuracy of measurements made by 
this technique are discussed. Some of the precautions neces- 
sary to attain high accuracy are given. 

A survey of microwave power-measurement techniques 
employed at the National Bureau of Standards, G. F. Engen, 
Inst. Elec. Engrs. 109, Pt. B, Suppl. No. 23, 734-739 (1962). 
The bolometric technique of power measurement is an im- 
portant part of the microwave art. The paper describes 
certain refinements and extensions of this basic method which 
have been developed at the Boulder Laboratories of the 
National Bureau of Standards and which provide the basis 
for a microwave power-calibration service. 
The attendant problems may be divided into three categories: 
(i) measurement of the substituted or bolometric bias power, 
(ii) evaluation of the d.c. r.f. substitution error, and (iii) 
determination of the bolometer mount efficiency. 
In the first area, the Laboratory has developed precise and 
automatic d.c. instrumentation which permits an accuracy 
of 0.1 percent to be realized on a routine basis. In the 
second and third areas, microcalorimetric techniques enable 
a determination of the total microwave power dissipated 
within the bolometer mount; this, when compared with a 
simultaneous bolometric measurement, determines the com- 
bined effect of the substitution error and mount efficiency. 
Another interesting tool for evaluating bolometer-mount 
efficiency is provided by the Kerns impedance method. The 
implementation of this technique has always proved a real 
challenge, but recent applications of modified reflectometer 
techniques to the problem have resulted in improved accuracy. 
Consistent agreement, within a half percent, with micro- 
calorimetric determinations has been realized at X-band 
frequencies. 

Because a bolometer completely absorbs the power being 
measured, the problem of comparing or calibrating a bolom- 
eter mount in terms of a second mount is inherently more 
difficult than that of comparing two voltmeters or ammeters 
where simultaneous observations of the same quantity are 
possible. This problem has, in fact, been a major limitation 
to the accuracy so far achieved in the art. 
Two methods of dealing with this problem have been de- 
veloped which employ directional couplers and related 
techniques. 

A variable-parameter direct-current switching filter, G. F. 
Montgomery, Proc. IRE 50, No. 9, 1986 (Sept. 1962). 
In a direct-current circuit controlled by a switch, the contact 
arc is suppressed and the current transient is modified by 
using a variable-parameter switching filter. A rectifier varies 
the network structure during contact make and break. 

Synthesis of an immittance function with two negative 
impedance converters, S. B. Geller, IRE Trans. Circuit 
Theory CT-9, No. 3, 291 (Sept. 1962). 

A technique is shown for synthesizing a function such as 
Y(s) -- 1 through the use of two negative impedance con- 

s 
verters. Kinariwala's method is used with a limit process 
when the function fails in one of the required constraints. 

The spontaneous martensitic transformations in 18% Cr, 
8% Ni steels, R. P. Reed, Acta Met. 10, 865-877 (Sept. 1962). 
On cooling, 18 percent Cr, 8 percent Ni steels transform 
martensitically to two products (ε and α′). Sheets representing 
either ε or stacking fault clusters have been observed to form 
prior to the formation of α′. Photographic sequences demon- 
strating the formation of α′ from these sheets are included. 
Some transformation characteristics of both ε and α′ are 
discussed. 

The morphology of the α′ has been determined. It was 
found that the α′ formed as long, narrow plates and that 
these plates were bounded by {111} sheets. The long direction 
of the plates was parallel or nearly parallel to <110>. If 
they were parallel to <110> then the plates had {225} habit 
planes. If they deviated from <110> then the habit plane 
was not (225); possible alternate habit planes are plotted. 
In addition, the [111] habit plane was observed. 

A high speed pyrometer, G. A. Hornbeck, Book, Temperature, 
Its Measurement and Control in Science and Industry III, 
Pt. 2, 425-428 (Reinhold Publ. Corp., New York, N.Y., 1962). 
A new high-speed selective spectrometer employed as a 
multiwavelength pyrometer is described. This instrument 
is essentially a device which permits the sequential measure- 
ment of a number of narrow bands of radiation at any chosen 
wavelengths at a high rate of speed. The prototype instru- 
ment which is described was designed to demonstrate a three 
wavelength pyrometer with a presentation rate permitting 
1,000 temperature determinations per second. 

The viscous heating correction for viscometer flows, E. A. 
Kearsley, Trans. Soc. Rheology VI, 253-261 (1962). 
A method is demonstrated for solving simple steady flows of 
incompressible Newtonian fluid with viscous heating. As an 
example, a generalization of Poiseuille flow is solved in simple 
terms. 

Wavelengths, energy levels, and pressure shifts in mercury 
198, V. Kaufman, J. Opt. Soc. Am. 52, No. 8, 866-870 (Aug. 
1962). 

The vacuum wavelengths of 27 lines of Hg¹⁹⁸ and lines of 
Kr⁸⁶ have been measured relative to the international stand- 
ard of length, the Kr⁸⁶ line at 6057.80211 Å, by photographic 
Fabry-Perot interferometry. These measurements were 
made with Hg¹⁹⁸ electrodeless lamps containing argon at 
pressures of ]a, ''\, and 10 mm Hg and a Kr⁸⁶ hot-cathode 
lamp operated according to the conditions prescribed by the 
International Conference on Weights and Measures in 1960. 
Energy-level values have been derived from the wavelengths 
of each of the Hg¹⁹⁸ sources, and on the basis of these values, 
the energy level and wavelength shifts per unit pressure of 
argon have been calculated. The suitability of the Hg¹⁹⁸ 
electrodeless lamp as a source of wavelength standards for 
interferometric measurement of length and wavelength is 
discussed. 

A network transfer theorem, G. F. Montgomery, IRE Trans. 
Audio AU-10, No. 3, 88 (May-June 1962). 
For a linear, passive, reciprocal two-port network, the forward, 
open-circuit voltage transfer ratio is equal to the reverse, 
short-circuit current transfer ratio. 
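
The statement can be illustrated numerically with open-circuit impedance (z) parameters. The sketch below is not taken from the paper: it assumes a simple T-network with hypothetical element values, takes port currents as flowing into the network, and measures the short-circuit current as the current delivered to the short.

```python
def t_network_z(za: complex, zb: complex, zc: complex):
    """z-parameters of a T-network: series arms za (port 1 side) and
    zb (port 2 side), shunt arm zc. Reciprocity gives z12 == z21."""
    z11 = za + zc
    z12 = zc
    z21 = zc
    z22 = zb + zc
    return z11, z12, z21, z22

def forward_voltage_ratio(z11, z12, z21, z22):
    """V2/V1 with port 2 open (I2 = 0): V1 = z11*I1 and V2 = z21*I1."""
    return z21 / z11

def reverse_current_ratio(z11, z12, z21, z22):
    """Current delivered to a short at port 1 per unit current driven
    into port 2. With V1 = 0, z11*I1 + z12*I2 = 0, so the current
    leaving port 1 is -I1 = (z12 / z11) * I2."""
    return z12 / z11

# Hypothetical element impedances in ohms at a single frequency.
za, zb, zc = 50 + 10j, 75 - 20j, 100 + 5j
z = t_network_z(za, zb, zc)
fwd = forward_voltage_ratio(*z)
rev = reverse_current_ratio(*z)
print(fwd, rev)
```

The two ratios coincide because z12 = z21 for a reciprocal network; for the T-network both reduce to the familiar divider ratio zc/(za + zc).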

Strengthening of hot work die steels, C. R. Irish and S. J. 
Rosenberg, Trans. Quart. 55, No. 3, 613-623 (Sept. 1962). 
A study of four hot-work die steels of the 5 percent chromium 
type showed that all retained a high percentage of their room 
temperature strengths at temperatures up to 800 °F. 
At 600 °F, the 1,000-hour stress-rupture life was in excess of 
98 percent of the short-time tensile strength at that tempera- 
ture. At 800 °F, failures were obtained at stresses between 
85 and 98 percent of the short-time strength. 
Specimens that survived 1,000 hours in the stress-rupture 
machines were subsequently tensile tested at room tempera- 
ture. The results obtained indicated that the strength of 
these specimens had been significantly increased. 

Acoustical interferometer employed as an instrument for 
measuring low absolute temperatures, G. Cataland and 
H. H. Plumb, J. Acoust. Soc. Am. 34, No. 8, 1145-1146 
(Aug. 1962). 

Values of absolute temperatures at 2° and 20 °K have been 
determined from experimental measurements of the speed of 
sound as a function of pressure in helium gas. The acoustical 
interferometer was the instrument employed in the measure- 
ments, and the accuracy achieved in the experiment indicates 
that sonic thermometry at low temperatures may be competi- 
tive with other conventional thermometry techniques. 
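
The principle behind sonic thermometry can be sketched with the ideal-gas relation c0² = γRT/M, where c0 is the speed of sound in the limit of zero pressure. The example below is a minimal illustration, not the authors' reduction procedure: it assumes a straight-line extrapolation of c² to zero pressure, and the pressure and sound-speed values are synthetic, generated inside the script.

```python
R = 8.314462618      # molar gas constant, J/(mol K)
M_HE = 4.002602e-3   # molar mass of helium-4, kg/mol
GAMMA = 5.0 / 3.0    # heat-capacity ratio of a monatomic ideal gas

def zero_pressure_intercept(pressures, c_squared):
    """Least-squares straight line c^2 = a + b*p; returns the intercept a."""
    n = len(pressures)
    mean_p = sum(pressures) / n
    mean_y = sum(c_squared) / n
    sxx = sum((p - mean_p) ** 2 for p in pressures)
    sxy = sum((p - mean_p) * (y - mean_y)
              for p, y in zip(pressures, c_squared))
    slope = sxy / sxx
    return mean_y - slope * mean_p

def temperature_from_sound_speed(c0_squared):
    """Solve the ideal-gas relation c0^2 = gamma*R*T/M for T."""
    return M_HE * c0_squared / (GAMMA * R)

# Synthetic data only: c^2 for helium at 20 K with a made-up linear
# pressure dependence; the pressure units are arbitrary.
true_T = 20.0
c0_sq = GAMMA * R * true_T / M_HE
pressures = [0.2, 0.4, 0.6, 0.8, 1.0]
c_squared = [c0_sq * (1.0 + 0.01 * p) for p in pressures]

T_est = temperature_from_sound_speed(
    zero_pressure_intercept(pressures, c_squared))
print(T_est)
```

With real measurements the pressure dependence need not be linear, and the choice of fitting function becomes part of the uncertainty analysis.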

Correlation of factors influencing the pressures generated in 
multi-anvil devices, J. C. Houck and U. O. Hutton, Am. 
Soc. Mech. Engrs. Paper 62-WA-254 (1962). 
Tests were performed with three different multianvil wedge- 
type high pressure devices, using pyrophyllite as the sample 
holder. Two devices made use of tetrahedrons, nominally 
]{'' and \" on an edge; the third used a nominal %" cube. 
The change in electrical resistance was used to detect the 
transitions of bismuth I-II (25.2 kb), bismuth II-III (26.6 kb) 
and barium (59 kb). Major effects observed were: (1) Oven 
drying the pyrophyllite sample holders to remove moisture 
gave significantly lower anvil forces to reach the transition. 
(2) A silver chloride sleeve not only caused transition to go to 
completion for smaller increases in applied anvil forces, but 
also reduced the anvil forces required to reach the bismuth 
II-III and the barium (59 kb) transition pressures. (3) A 
wide range of sizes of sample holders in the same size die had 
little effect on anvil force required to reach the transition. 
(4) Comparison of results with the }{'' and V tetrahedrons 
showed that the ram loads required to attain the transitions 
were proportional to the face areas of the anvils. 
A ''two-stage" device was constructed by insertion of hardened 
steel truncated cones in the faces of the pyrophyllite tetrahe- 
drons. This arrangement permitted the attainment of the 
bismuth 88 kb transition with the ram load reduced to about 
one-half of that required for the single-stage arrangement. 

The temperature dependence of flow and fracture character- 
istics of an age-hardenable alloy, W. D. Jenkins and W. A. 
Willard, Trans. ASM 55, No. 3, 580-598 (Sept. 1962). 
Mechanisms contributing to flow, fracture and ductility of 
polycrystalline Duranickel tensile specimens tested in the 
temperature range 75° to 1,200 °F are discussed. The 
temperature dependence of the yield point phenomenon and 
reversals in strength-temperature curves of the annealed metal 
are attributed to precipitation of Ni₃Al during deformation. 
Increase in strength and decrease in ductility due to aging 
are rationalized on the basis of the presence of precipitates 
which interfere with the motion of dislocations. The density, 
distribution, shape and size of slip bands, precipitated parti- 
cles, cracks and cavities in the specimens before and after 
fracture were observed by means of optical and electron 
microscopy and are discussed by use of dislocation theory. 
The influence of aging on tensile deformation and of tensile 
deformation on aging is partially analyzed by means of 
hardness values. 

The effect of experimental variables including the martensitic 
transformation on the low-temperature mechanical properties 
of austenitic stainless steels, C. J. Guntner and R. P. Reed, 
Trans. ASM 55, No. 3, 399-491 (Sept. 1962). 
The austenitic stainless steels in general are excellent low 
temperature structural materials. Most of these steels 
undergo a martensitic transformation which produces both a 
hexagonal close-packed phase and a body centered phase and 
affects the mechanical properties. 

Tensile, notched-tensile and impact tests have been conducted 
on AISI 202, AM 350, USS Tenelon, AISI 310 and 5 com- 
mercial grades of AISI 304 and AISI 304L at temperatures 
between 300 and 4° K. Values are obtained for tensile 
strength, notched-tensile strength, impact strength, tensile 
elongation, tensile reduction of area and notched-tensile 
reduction of area. A plot is also included on the percent 
phase transformation as a function of tensile strain and tem- 
perature. Experimental variables considered are temperature, 
strain rate, specimen geometry and initial microstructure. 
The influences of the experimental variables and the mar- 
tensitic transformation characteristics on the mechanical 
properties are discussed. 

Characteristics of resistance strain gages, R. L. Bloss, Book, 
Semiconductor and Conventional Strain Gages, Ed. Mills 
Dean, III and R. D. Douglas, chapt. VII, 123-142 (Academic 
Press, Inc., New York, N.Y., Oct. 1962). 

Although resistance strain gages are very useful devices for 
many applications, their characteristics and limitations must 
be examined closely when use in a new situation is contem- 
plated. The characteristics and factors which may limit the 
usefulness of these gages include (1) strain sensitivity, (2) 
temperature sensitivity, (3) resistance instability, (4) shelf 
life of components, (5) effects of moisture, (6) incompatibility 
of components, (7) fatigue life, (8) frequency range, (9) mag- 
netostrictive effects, and (10) incompatibility with environ- 
ment. These factors are discussed and illustrated. 

Standard tests for electrical properties, A. H. Scott, SPE J. 
1375-1378 (Nov. 1962). 

A discussion is given of the use of standard tests to deter- 
mine (1) volume and surface resistivity, (2) permittivity 
(dielectric constant) and dissipation factor, (3) dielectric 
strength and (4) arc resistance or tracking. Both American 
and International Tests are cited. 

Chromium plating by thermal decomposition of dicumene 
chromium, W. H. Metzger, Jr., Plating 49, No. 11, 1176 
(Nov. 1962). 

A technical note describing experiments on chromium plating 
by thermal decomposition of dicumene chromium. 

New wavemeter for millimeter wavelengths, R. W. Zimmerer, 
Rev. Sci. Instr. 33, No. 8, 858-859 (Aug. 1962). 
A new wavemeter of simple design is described. The princi- 
ple of operation makes use of a new development in physical 
optics. The actual performance of the device was measured 
and compared with the theory. 

The use of a Venturi tube as a quality meter, R. V. Smith, 
P. C. Wergin, J. F. Ferguson, and R. B. Jacobs, J. Basic 
Eng. 84, 411-412 (Sept. 1962). 

It is shown that the relationship between the pressure drop 
Δp, the mass rate of flow m, and the quality x, of the two- 
phase fluid flowing through a Venturi can be correlated by 
the following expression 

i-\-hx 

It follows that if the pressure drop and mass flow rate are 
measured, the quality is easily computed. 



Dislocation loops in deformed copper, A. W. Ruff, Jr., Fifth 
Intern. Congress for Electron Microscopy, p. J-10 (Academic 
Press, Inc., New York, N.Y., 1962). 

An examination by transmission electron microscopy of single 
crystal copper foils which were deformed 12 percent and 20 
percent by rolling, has revealed the presence of considerable 
numbers of small dislocation loops. Average values are given 
for the dislocation density and the loop density. It is believed 
that these loops were formed from point defects generated during 
the deformation, and that the point defect concentration 
immediately after deformation was at least 10~'\ The 
diffraction contrast effects associated with the loops indicate 
that the dislocations are complete and not partial. 

Applications of resistance thermometers to calorimetry, 
G. T. Furukawa, Book, Temperature, Its Measurement and 
Control in Science and Industry III, Pt. 2, 317-328 (Reinhold 
Publ. Co., New York, N.Y., 1962). 

The importance of the resistance thermometer in the accurate 
measurement of both temperature and the heat leak of the 
calorimeter is discussed. The final accuracy of the determina- 
tion of heat capacity is shown to be dependent upon the 
accurate and consistent measurement of heat input to the 
sample and the corresponding rise in temperature. The 
various heat-capacity calorimeters used in the range from 10 
to 400 °K are briefly described with emphasis upon the 
applications of resistance thermometers, the methods for cali- 
brating them, and the problems associated with the design of 
calorimeter vessels. Comparison is made of the thermometric 
properties of platinum, copper, indium, lead and gold-silver 
alloy. The need for high relative accuracy in the measurement
of ΔT is emphasized. The various temperature scales
used in calorimetry are compared and their applications are
described.

Absorption spectrum of carbon vapor in solid
argon at 4° and 20 °K, R. L. Barger and H. P. Broida, J.
Chem. Phys. 37, No. 5, 1152-1153 (Sept. 1, 1962).

Obtaining the internal junction characteristics of a transistor 
for use in analog simulation, S. B. Geller, IRE Trans. 
Electron. Computers EC-11, No. 5, 709-710 (Oct. 1962).
A technique is described for making the internal base-to- 
emitter junction characteristics of an alloy junction transistor 
available to an analog computer simulation process. This is 
accomplished with an active feedback network that continu- 
ously compensates for the internal voltage drop across the 
extrinsic base-spreading resistance at all base current levels. 

The thermodynamic scale of temperature below 1 °K,
R. P. Hudson, Book, Temperature, Its Measurement and
Control in Science and Industry III, Pt. 1, 51-57 (Reinhold
Publ. Co., New York, N.Y., 1962).

Following a brief discussion of the principles of magnetic 
thermometry, a description is given of the main methods used
to derive the relation between the "magnetic scale" and the
absolute scale of temperature. Experimental results pub- 
lished since 1953 are summarized. An account is given of 
the measurement of absolute temperature using the anisotropy 
of radiation emitted from oriented radioactive nuclei. Recent 
work on the intercomparison of "nuclear orientation scales",
and on the comparison of one such scale with a magnetic 
scale, is reviewed. There follows a short account of the 
adaptation of the magnetic-cooling method to nuclear para- 
magnetics and the production of temperatures of the order 
of one microdegree Kelvin. 

Precision phase meter, D. M. Waters, D. Smith, and M. C.
Thompson, Jr., IRE Trans. Instr. I-11, 64-66 (Sept. 1962).
A precision electromechanical phase meter has been developed 
to record slow, continuous phase variations often encountered 
in radio propagation research. The phase meter will follow 
phase variations up to several complete cycles unambiguously 
and small phase variations as fast as 1 c/s. 

The effect of temperature and humidity on the oxidation of 
air-blown asphalts, P. G. Campbell, J. R. Wright, and 
P. B. Bowman, Mater. Res. Std. 2, No. 12, 988-995 (Dec.



The effects of temperature and humidity on the oxidation of 
air-blown roofing asphalts were determined by measuring the 
changes in infrared absorption in the carbonyl band caused by 
carbon-arc exposure of the asphalts under varying conditions
of temperature and relative humidity. Asphalt oxidation
was measured for both fixed periods of exposure, and as a 
function of exposure time. Both temperature and humidity 
affected the rate of asphalt oxidation, with temperature being 
the more critical parameter. The sensitivity of the asphalts 
to changes in temperature and relative humidity varied with 
asphalt source and asphalt durability. The relative order of 
oxidation stability of a series of asphalts exposed outdoors 
was, in general, the same as that obtained with exposure to 
carbon-arc radiation. The formation and subsequent
decomposition of an asphalt-oxygen-water complex is proposed
as a possible mechanism for the effects of temperature and 
relative humidity on asphalt oxidation rates. 

The thermodynamic temperature scale, its definition and
realization, C. M. Herzfeld, Book, Temperature, Its Measurement
and Control in Science and Industry III, Pt. 1, 41-50
(Reinhold Publ. Co., New York, N.Y., 1962).
A temperature scale based on thermodynamics is conceptually 
straightforward. The usual definition is given and the 
realization of the scale by means of gas thermometry is dis- 
cussed. 

The scale can be extended by appeal to the statistical me- 
chanical interpretation of thermodynamics. Ways of doing this 
for high and low temperatures, using radiation and magnetic 
methods respectively, are presented. 

Statistical mechanical arguments make possible the use of 
the concept of temperature for systems differing greatly from 
those contemplated in classical thermodynamics. 

Use of a "peek-a-boo" information retrieval technique for a 
personal reference file, J. A. Bennett, J. Wash. Acad. Sci.
52, No. 9, 216-219 (Dec. 1962).

Optical coincidence subject cards have many advantages for 
indexing a personal reference file. A system having a capac- 
ity of 1,500 items has proven very useful and does not 
require complicated punching and searching equipment. 
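
The optical coincidence principle mentioned above can be sketched in a few lines of Python. This is only an illustration, not the system described in the cited paper: it assumes one card per subject term and one hole position per filed item, and the names punch, file_item, and search, as well as the sample subject terms, are hypothetical. Only the capacity of 1,500 items comes from the abstract.

```python
# Hypothetical sketch of optical-coincidence ("peek-a-boo") retrieval.
# Each subject card is modelled as an integer bitmask; bit i set means
# a hole is punched at the position reserved for item number i.

CAPACITY = 1500  # item positions per card, the figure quoted in the abstract


def punch(card, item):
    """Return the card with a hole punched at the item's position."""
    if not 0 <= item < CAPACITY:
        raise ValueError("item number outside card capacity")
    return card | (1 << item)


def file_item(index, item, subjects):
    """Record an item by punching its position on every relevant subject card."""
    for subject in subjects:
        index[subject] = punch(index.get(subject, 0), item)


def search(index, subjects):
    """Stack the requested cards; light passes only where all have holes."""
    stacked = (1 << CAPACITY) - 1  # start with every position open
    for subject in subjects:
        stacked &= index.get(subject, 0)
    return [i for i in range(CAPACITY) if (stacked >> i) & 1]


index = {}
file_item(index, 12, ["creep", "copper"])
file_item(index, 47, ["copper", "fatigue"])
file_item(index, 903, ["creep", "copper", "fatigue"])

print(search(index, ["creep", "copper"]))    # [12, 903]
print(search(index, ["copper", "fatigue"]))  # [47, 903]
```

Stacking cards corresponds to a bitwise AND, which is why no searching equipment beyond a light source is needed in the physical version.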

Photographic strain measuring technique for use above 3,000
°F, L. Mordfin and T. Rubusto, Jr., Proc. Instr. Soc. Am. 17,
Pt. 1, 3.4.62-1 (1962).

A technique for measuring local surface strains in a structural 
test specimen is proposed, in which gage point markings ap- 
plied to the surface of the specimen are photographed. This 
paper describes an exploratory application of this method 
to the measurement of axial and lateral strains in the tensile 
test of a molybdenum rod at 3,500 °F in a vacuum.

The speed of processes involved in electroplating: movement 
of solute, attainment of the steady state and formation of 
metal, A. Brenner, 49th Annual Tech. Proc. Am. Electroplaters'
Soc., p. 9-18 (1962).

The time involved in various processes occurring during or- 
dinary electroplating was discussed. Ions moved to the 
cathode at the rate of 10""* cm/sec. The upward move- 
ment of convection currents along an electrode was about 6 
cm per minute. The time required to reach a steady state 
of electrolysis was about 2 minutes. Deposition of metal can 
be made to occur with a microsecond pulse of current. By 
means of galvanostalametry it was shown that an electrode
reaction occurs in less than 5 microseconds after a circuit is 
closed. 

Journal of Research 67 A (Phys. and Chem.), No. 1 (Jan.- 
Feb. 1963), 70 cents. 

Heat of formation of calcium aluminate monosulfate at 25 °C.
H. A. Herman and E. S. Newman.

2,3-Dimethylpentane and 2-methylhexane as a test mixture
for evaluating highly efficient fractionating columns. E. C.
Kuehner.

Phase equilibrium relations in the Sc2O3-Ga2O3 system. S. J.
Schneider and J. L. Waring.

Analysis of two infrared bands of CH2D2. Wm. B. Olson,
H. C. Allen, Jr., and E. K. Plyler.

Precise coulometric titrations of halides. G. Marinenko and
J. K. Taylor.

Radial distribution study of vitreous barium borosilicate.
G. J. Piermarini and S. Block.



Dynamic compressibility of poly (vinyl acetate) and its rela- 
tion to free volume. J. E. McKinney and H. V. Belcher. 

An investigation of the constitution of the mercury-tin system.
D. F. Taylor and C. L. Burns.

Effect of methyl bromide additions on the flame speed of
methane. C. Halpern.

Journal of Research 67A (Phys. and Chem.), No. 2 (Mar.- 
Apr. 1963), 70 cents. 

Third spectrum of palladium (Pd iii). A. G. Shenstone. 

Broadening of the rotational lines of carbon monoxide by 
HCl and argon. R. J. Thibault, J. H. Jaffe, and E. K.
Plyler. 

Theory of frustrated total reflection involving metallic sur- 
faces. T. R. Young and B. D. Rothrock. 

Quantitative metallography with a digital computer: applica- 
tion to a Nb-Sn superconducting wire. G. A. Moore and 
L. L. Wyman. (See above abstracts.) 

Moire fringes produced by a point projection X-ray micro- 
scope. S. B. Newman. (See above abstracts.) 

Cyclic polyhydroxy ketones. I. Oxidation products of 
hexahydroxybenzene (benzenehexol). A. J. Fatiadi and 
H. S. Isbell. 

Effect of pressure and temperature on the refractive indices 
of benzene, carbon tetrachloride, and water. R. M.
Waxler and C. E. Weir.

Pressure-density-temperature relations of fluid para hydrogen 
from 15 to 100 °K at pressures to 350 atmospheres. R. D. 
Goodwin, D. E. Diller, H. M. Roder, and L. A. Weber.

A method for determining the elastic constants of a cubic 
crystal from velocity measurements in a single arbitrary 
direction; application to SrTiO3. J. B. Wachtman, Jr.,
M. L. Wheat, and S. Marzullo. (See above abstracts.) 

Journal of Research 67B (Math. and Math. Phys.), No. 1
(Jan.-Mar. 1963), 75 cents. 

Evaluation of a generalized elliptic-type integral. L. F. 
Epstein and J. H. Hubbell. 

An algorithm for obtaining an orthogonal set of individual 
degrees of freedom for error. J. M. Cameron.

Recognition of completely mixed games. A. J. Goldman. 

A new type of computable inductor. C. H. Page. (See
above abstracts.) 

Numerical computation of the temporal development of
currents in a gas discharge tube. W. Börsch-Supan and
H. Oser.

Tables of genera of groups of linear fractional transformations.
H. Fell, M. Newman, and E. Ordman.

Journal of Research 67D (Radio Prop.), No. 1 (Jan.-Feb. 
1963), 70 cents. 

A lunar theory reasserted - a rebuttal. J. V. Evans.

Point-to-point communication on the moon. L. E. Vogler.

HF communication during ionospheric storms. G. K. Hill. 

Use of surface refractivity in the empirical prediction of total 
atmospheric refraction. W. R. Iliff and J. M. Holt. 

Effective sunspot numbers. W. B. Chadwick. 

On the theory of radio wave propagation over inhomogeneous 
earth. K. Furutsu. 

Fields of electric dipoles in sea water (a correction). W. 
Anderson. 

Composition of reflection and transmission formulae. J. 
Heading. 

Titheridge coefficients for the polynomial method of deducing 
electron density profiles from ionograms. A. R. Long and 
J. O. Thomas.

Input admittance of linear antennas driven from a coaxial 
line. T. T. Wu. (See above abstracts.) 

Journal of Research 67D (Radio Prop.), No. 2 (Mar.-Apr.
1963), 70 cents. 

The protection of frequencies for radio astronomy. R. L.
Smith-Rose.

Radar reflections from the moon at 425 Mc/s. G. H. Millman
and F. L. Rose.

Sunset and sunrise in the ionosphere: effects on the propagation
of longwaves. J. Rieker.

Correction of atmospheric refraction errors in radio height
finding. W. B. Sweezy and B. R. Bean.

Empirical determination of total atmospheric refraction at
centimeter wavelengths by radiometric means. A. C. Anway.

Propagation of radiofrequency electromagnetic fields in
geological conductors. V. Fritsch, translated from German
by A. P. Barsis.

WWV reception in the arctic during ionospheric disturbances.
G. E. Hill and J. R. Herman.

Height-gain for VLF radio waves. J. R. Wait and K. P.
Spies.

Perturbation method in a problem of waveguide theory.
D. Fox and W. Magnus.

Some wave functions and potential functions pertaining to
spherically stratified media. C. T. Tai.

Radiation from a plasma-clad axially-slotted cylinder.
W. V. T. Rusch.

Two- and three-loop superdirective receiving antennas.
E. W. Seeley.

Hallen's method in the problem of a cavity-backed rectangular
slot antenna. J. Galejs.

Relative convergence of the solution of a doubly infinite set
of equations. R. Mittra.



Periodicals received in the Library of the National Bureau of
Standards, July 1962, N. J. Hopper, NBS Mono. 57 (Nov.
23, 1962), 25 cents (Supersedes NBS Circular 563 and the
1st supplement to NBS Circular 563).

Handbook for CRPL Ionospheric Predictions Based on
Numerical Methods of Mapping, S. M. Ostrow, NBS
Handb. 90 (Dec. 21, 1962), 40 cents (Supersedes Circ. 465).

Report of the 47th National Conference on Weights and
Measures 1962, NBS Misc. Publ. 244 (Nov. 23, 1962),
75 cents.

Hydraulic research in the United States 1962, H. K. Middleton,
NBS Misc. Publ. 245 (Oct. 26, 1962), $1.00.

1962 Research Highlights of the National Bureau of Standards,
Annual Report, NBS Misc. Publ. 246 (Dec. 1962),
70 cents.

Quarterly radio noise data, March, April, May 1962 and
corrigendum for Technical Notes 18-1 through 18-11,
W. Q. Crichlow, R. T. Disney and M. A. Jenkins, NBS
Tech. Note 18-14 (Aug. 9, 1962), 50 cents.

Mean electron density variations of the quiet ionosphere.
No. 8, October 1959, J. W. Wright, L. R. Wescott, and
D. J. Brown, NBS Tech. Note 40-8 (Sept. 1962), 35 cents.

Synoptic radio meteorology, B. R. Bean, J. D. Horn, and
L. P. Riggs, NBS Tech. Note 98 (Oct. 1962), 50 cents.

Bibliography on direction finding and related ionospheric
propagation topics, 1955-1961, O. D. Remmler, NBS Tech.
Note 127 (Oct. 1962), 60 cents.

Equatorial spread F, W. Calvert, NBS Tech. Note 145
(Aug. 1, 1962), 60 cents.

The energy parameter B for strong blast waves, D. L. Jones,
NBS Tech. Note 155 (July 1962), 25 cents.

Thermal balance in the F region of the atmosphere, D. C.
Hunt, NBS Tech. Note 162 (Sept. 1962), 50 cents.

Spectrophotometric determination of hydroperoxide in
diethyl ether, W. C. Wolfe, Anal. Chem. 34, No. 10, 1328-1330
(Sept. 1962).

Analysis of the hydroxyl radical vibration rotation spectrum
between 3900 Å and 15000 Å, A. M. Bass and D. Garvin,
J. Mol. Spectry. 9, No. 2, 114-123 (Aug. 1962).

Structure and structure imperfections of solid β-oxygen, E. M.
Horl, Acta Cryst. 15, No. 9, 845-850 (Sept. 1962).

Kinetics of Cs+ desorption from tungsten, M. D. Scheer and
J. Fine, J. Chem. Phys. 37, No. 1, 107-113 (July 1962).

Effect of additives on silver iodide particles exposed to light,
G. Burley and D. W. Herrin, J. Appl. Meteorol. 1, No. 3,